December 02, 2020

OpenStack Superuser

OpenStack in Production and Integration with Ceph: A European Weather Cloud User Story

The European Centre for Medium-Range Weather Forecasts (ECMWF), an intergovernmental organisation, was established in 1975. Based in Reading, UK, with its data centre soon moving to Bologna, Italy, ECMWF spans 34 states in Europe. It operates one of the largest supercomputer complexes in Europe and the world’s largest archive of numerical weather prediction data. In terms of its IT infrastructure, ECMWF’s HPC (high-performance computing) facility is one of the largest weather sites globally. With cloud infrastructure for the Copernicus Climate Change Service (C3S), the Copernicus Atmosphere Monitoring Service (CAMS), WEkEO, which is a Data and Information Access Service (DIAS) platform, and the European Weather Cloud, teams at ECMWF maintain an archive of climatological data with a size of 250 PB and daily growth of 250 TB.

ECMWF’s production workflow, from data acquisition to data dissemination, with products distributed to end users via the internet, leased lines, and the Regional Meteorological Data Communication Network (RMDCN).

European Weather Cloud:

The European Weather Cloud started three years ago as a collaboration between ECMWF and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), aiming to make it easier to work on weather and climate big data in a cloud-based infrastructure. With the goal of bringing the computation resources (cloud) closer to their big data (meteorological archive and satellite data), ECMWF’s pilot infrastructure was built with open source software: Ceph and OpenStack, deployed using TripleO.

The graph below shows the current state of the European Weather Cloud overall infrastructure comprising two OpenStack clusters: one built with OpenStack Rocky and another one with OpenStack Ussuri. The total hardware of the current configuration comprises around 3,000 vCPUs, 21 TB RAM for both clusters, 1PB of storage and 2×5 NVIDIA Tesla V100 GPUs.

Integration with Ceph:

The graph below shows the cloud infrastructure of the European Weather Cloud. As you can see, Ceph is built and maintained separately from OpenStack, which gives the teams at the European Weather Cloud a lot of flexibility in building different clusters on the same Ceph storage. Both of its OpenStack clusters use the same Ceph infrastructure and the same RBD pools. Aside from the usual HDD failures, Ceph performs very well, and the teams at the European Weather Cloud plan to gradually move to CentOS 8 (given the limited remaining support for CentOS 7) and to upgrade to Octopus and cephadm on the live cluster after extensive testing in their development environment.

European Weather Cloud infrastructure

OpenStack with Rocky version:

The first OpenStack cluster in the European Weather Cloud, built in September 2019, is based on Rocky and was deployed with the TripleO installer. In the meantime, engineers at the European Weather Cloud also created another development environment with similarly configured OpenStack and Ceph clusters for testing and experimentation.

Experience and Problems:

Their deployment has about 2,600 vCPUs with 11 TB of RAM and has not had any significant problems. The external Ceph cluster integration worked with minimal effort, requiring only small modifications to ceph-config.yaml. Setting up the two external networks (one public facing and another for fast access to their 300 PB data archive) was straightforward.
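
As an illustrative sketch only (not ECMWF’s actual file): in a TripleO deployment, pointing the overcloud at an existing external Ceph cluster typically comes down to a small environment file along these lines, where the FSID, client key and monitor addresses are placeholders:

parameter_defaults:
  # Identify the pre-existing external Ceph cluster (placeholder values).
  CephClusterFSID: '11111111-2222-3333-4444-555555555555'
  CephClientKey: 'AQC...placeholder-key...=='
  CephExternalMonHost: '192.0.2.10,192.0.2.11,192.0.2.12'
  # Use RBD pools on that cluster for Nova, Cinder and Glance.
  NovaEnableRbdBackend: true
  CinderEnableRbdBackend: true
  GlanceBackend: rbd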

Most of their VMs are attached to both external networks with no floating IPs, which made VM routing challenging since there is no dynamic routing on the switches. To solve this, they used DHCP hooks to configure VM routing before making the images available to users.
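
A minimal sketch of the kind of DHCP client hook that can be baked into a guest image for this purpose; the hook directory, interface name and destination network are placeholders and vary by distribution:

# /etc/dhcp/dhclient-exit-hooks.d/archive-route (placeholder path)
# Add a static route towards the archive network when the second NIC gets a lease.
if [ "$interface" = "eth1" ] && [ "$reason" = "BOUND" ]; then
    ip route add 192.0.2.0/24 via "$new_routers" dev eth1
fi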

At the beginning they encountered some problems with the NIC bond interface configuration in conjunction with the configuration of their switches. The engineers therefore decided not to use a Link Aggregation Control Protocol (LACP) configuration, and moved to a single network interface card (NIC) deployment for OpenStack. They also encountered some problems with Load-Balancing-as-a-Service (LBaaS), due to Octavia certificates being overridden on each deployment. The problem is presented here.

Once they found solutions to these challenges, the engineers updated their live system and moved the whole cluster from a single-NIC to a multiple-NIC deployment, transparently to users and with zero downtime. The whole first cluster was redeployed, and the network was reconfigured with Distributed Virtual Routing (DVR) for better network performance.

OpenStack update efforts from Stein to Train to Ussuri:

In March 2020, engineers at the European Weather Cloud added more hardware for their OpenStack and Ceph cluster, and they decided to investigate upgrading to the newest versions of OpenStack. 

Experience and Problems:

First, they converted their Rocky undercloud to a VM for better management and as a safety net for backups and recovery. From March to May 2020, they investigated and tested upgrading to Stein (first the undercloud and then the overcloud, in a test environment). Updating step by step from Rocky to Stein to Train and finally to Ussuri was possible, but because Ussuri is based on CentOS 8, the CentOS 7 to CentOS 8 transition made this path impractical. Therefore, they made a big jump from Rocky to Ussuri, skipping the intermediate upgrades, and decided to deploy their new systems directly on OpenStack Ussuri.

OpenStack Ussuri Cluster:

The second OpenStack cluster, based on Ussuri, was first built in May 2020, 17 days after the release of Ussuri on May 13. This cluster started as a plain vanilla configuration: although the network was properly configured with OVN and provider networks across 25 nodes, it did not yet have any integration with Ceph storage.

Experience and Problems:

The new deployment method, based on Ansible instead of Mistral, had some hiccups, such as the switch from the stack user to the heat-admin user, which is not what operators were used to. In addition, they had to quickly understand and master the CentOS 8 base operating system for both the host systems and the service containers. Engineers at the European Weather Cloud also continued with OVS instead of OVN because of the implications for assigning floating IP addresses. With help from the OpenStack community, the problems were solved, and the cluster was rebuilt in mid-June 2020.

Regarding GPUs, the configuration of the NVIDIA GPUs was straightforward. However, since they had not implemented IPv6 in their Ussuri cluster, when they installed and configured the GPU drivers on a node, OVS tried to bind to IPv6 addresses at boot time, which considerably increased boot time. The workaround was to explicitly remove the IPv6 configuration from their GPU nodes. All nodes with a GPU also serve as normal compute nodes, and nova.conf is configured through their Ansible playbooks.

GPU profiles assignment to VMs based on specific VM flavors for each profile
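
A minimal illustrative sketch of how such a profile-per-flavor mapping can be expressed in Nova, whether via vGPU profiles or full passthrough; the vGPU type name, flavor name and sizes below are assumptions, not ECMWF’s actual settings:

# nova.conf on a GPU compute node: expose one vGPU profile (placeholder type name)
[devices]
enabled_vgpu_types = nvidia-105

# Flavor that requests one vGPU of that profile
$ openstack flavor create --vcpus 8 --ram 65536 --disk 100 gpu.v100
$ openstack flavor set gpu.v100 --property "resources:VGPU"="1"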

Future Next Steps:

In terms of the European Weather Cloud’s infrastructure, the engineers are planning to integrate the infrastructure with other internal systems for better monitoring and logging. They are also planning to phase out the Rocky cluster and move all the nodes to Ussuri. Trying to follow the latest versions of OpenStack and Ceph, they will continue to operate, maintain and upgrade the Cloud’s infrastructure.

For the federation, the goal is to federate their Cloud infrastructure with infrastructures of their Member states. They have identified and will continue to explore potential good use cases to federate. 

Regarding the integration with other projects, the European Weather Cloud will be interfacing with the Digital Twin Earth which is a part of the Destination Earth Program of the EU. 

Teams at the European Weather Cloud are also planning to contribute code and help other users that are facing the same problems while deploying clusters in the OpenStack community.

Get Involved:

Watch this session video and more on the Open Infrastructure Foundation YouTube channel. Don’t forget to join the global Open Infrastructure community, and share your own personal open source story using #WeAreOpenInfra on social media.

Special thanks to our 2020 Open Infrastructure Summit sponsors for making the event possible:

  • Headline: Canonical (ubuntu), Huawei, VEXXHOST
  • Premier: Cisco, Tencent Cloud
  • Exhibitor: InMotion Hosting, Mirantis, Red Hat, Trilio, VanillaStack, ZTE

The post OpenStack in Production and Integration with Ceph: A European Weather Cloud User Story appeared first on Superuser.

by Superuser at December 02, 2020 02:00 PM

Opensource.com

Set up OpenStack on a Raspberry Pi cluster

In the year since the Raspberry Pi 4 was released, I've seen many tutorials (like this and this) and articles on how well the 4GB model works with container platforms such as Kubernetes (K8s), Lightweight Kubernetes (K3s), and Docker Swarm. As I was doing research, I read that Arm processors are "first-class citizens" in OpenStack.

by ajscanlas at December 02, 2020 08:01 AM

December 01, 2020

OpenStack Superuser

Embarking on a New Venture: Creating a Private Cloud with OpenStack for Under $1700

I have two business ideas to explore, and I decided that now is a good time to take the plunge and create a prototype. My hesitation throughout the last year was due to the time and financial investment required. After some inspiration, detailed thought, and self-evaluation, I am ready to go for it. Worst case scenario, this is going to eat up a lot of my time. Even if I lose time, I will learn a lot about cloud infrastructure, cloud networking, and cloud instance provisioning. My first business idea is in the realm of home and small business network cyber security. The second utilizes a private cloud platform to provision labs for IT and cyber security training. A small virtual lab isn’t going to cut it for these ventures.

My Current Setup

Before I can pursue these builds, I need to upgrade my home network and lab and select a platform. I currently have 3 old used servers (2 Dell PowerEdge R510s and an HP ProLiant DL360) for the cloud. For networking, I have an ancient Cisco switch. I think I can get by with the old switch for now, but my small private cloud requires more servers. I can use the private cloud to provision networks to test out capabilities, learn, and design. These can also hold prototypes and proof of concepts for demonstrations. For the private cloud, I selected OpenStack as my platform. This will allow me to provision instances using Terraform, and have more flexibility with networking configuration. I can also avoid a large AWS and Azure bill while I experiment with different configurations. The only thing that will suffer is my power bill.

These are my Dell R510s and Cisco 3560, forgive the mess, straightening this out is part of the project.

Project Goals

Based on the OpenStack documentation I will need at least 4-5 servers to support my configuration which is a small compute cloud. To use Juju and Metal as a Service (MAAS) to deploy the cloud, I will need 2 more servers, but I could probably use one of my servers and host 2 VMs instead of purchasing another server. I haven’t yet decided whether I am going to use Juju and MAAS to deploy OpenStack, but I do know that I need at least 2 more servers for my project. I also want to separate my private cloud from the rest of my network and still maintain network performance with the added security, so I will need a firewall / IPS appliance. Once complete, my home network will look something like this:

The private cloud will be located on a DMZ allowing me to apply different security standards.

My Private Cloud Budget

I am trying to stay under $2,000 total for this project (including what I already spent). Below is the price I paid for everything I already have.

Device               Qty  Unit Cost  Shipping  Total Cost
HP ProLiant DL360    1    $149.99    $112.89   $262.88
Dell PowerEdge R510  2    $238.99    $75.00    $552.98
Cisco Catalyst 3560  1    $69.00     $17.95    $86.95
Total Cost                                     $902.81

Existing devices with costs at the time of purchase

So, based on that I have about $1100 to spend. Although I have plenty of room, I am sticking with used equipment. The only exception I am making is my firewall appliance.

Purchasing New Equipment

I was able to find 2 Dell PowerEdge R610s for $157 each, well within budget. My shipping costs to my location are really high, so I have to keep that in mind. Even with the shipping costs, I still consider these a bargain and they meet my needs. These servers also come from the same vendor as my previous purchases (PC Server and Parts), so I know they will arrive in good condition and operate well.

Dell PowerEdge R610 server

Next I need a firewall appliance, for this I am going straight to a vendor because their site is a lot cheaper than Amazon. This appliance from Protectli has 4 NICs, a quad core processor, and a small SSD. This is more than enough to run pfsense (and it was already tested for it), so it will easily meet my needs and be a step up from my current options for under $300.

Protectli Firewall Appliance

Total Costs

With those 2 purchases, I have all the equipment I will need, and significantly under my max budget! The only other purchase I might make is a rack to store the equipment and a PDU. For now, I just have to wait for them to arrive. I plan to start sometime in December. While I wait, I am going to work on my remote access solutions, determine what IDS/IPS I am going to use (Suricata, Snort, or Bro), and finalize my design of how this will all fit together.

Device               Qty  Unit Cost  Shipping  Total Cost
HP ProLiant DL360    1    $149.99    $112.89   $262.88
Dell PowerEdge R510  2    $238.99    $75.00    $552.98
Cisco Catalyst 3560  1    $69.00     $17.95    $86.95
Protectli FW4B       1    $282.00    $7.00     $289.00
Dell PowerEdge R610  2    $156.99    $111.00   $424.98
Total Cost                                     $1616.79

All devices with costs at time of purchase

This article was originally posted on mattglass-it.com. See the original article here.

The post Embarking on a New Venture: Creating a Private Cloud with OpenStack for Under $1700 appeared first on Superuser.

by Matt Glass at December 01, 2020 02:00 PM

November 30, 2020

VEXXHOST Inc.

Why OpenStack Private Cloud for DevOps is the Way Forward


OpenStack private cloud for DevOps is gaining much traction even among fierce competition. The flexible nature of the open source platform allows DevOps engineers to innovate from time to time. OpenStack also maximizes existing infrastructure and helps engineers tackle untoward incidents with ease.

Agile Development with OpenStack

OpenStack has established itself as a gold standard for building private clouds among Infrastructure as a Service (IaaS) platforms. The open source nature of the platform allows engineers to act autonomously to provision and de-provision cloud environments. OpenStack works as a self-service mechanism with all the flexibility cloud builders need. Another advantage is that letting engineers provision resources themselves reduces downstream bottlenecks for the operations team.

OpenStack is not just open source but also vendor-agnostic. This enables the end-user to take full advantage of competitive pricing. There is no vendor lock-in with OpenStack. The availability of a private cloud at prices comparable to public clouds works great for organizations with large-scale data needs.

Another significant feature of OpenStack private cloud, compared to a public cloud, is its ability to have more control in optimizing application performance and security. Companies with sensitive data to handle prefer OpenStack private clouds for DevOps and further use for the same reason.

Beginning the Journey with an OpenStack Private Cloud 

In the initial phases, the integration of OpenStack clouds might seem like a challenge to enterprises used to traditional IT infrastructures. But, an experienced cloud provider can make this process a breeze. Once the company makes it clear what they want in their cloud for DevOps and later use, the provider takes care of the rest. The flexibility of OpenStack really comes in handy here as it allows tailoring the platform according to individual needs.

Moreover, OpenStack also comes with regular updates and releases across the board frequently. The cloud provider ensures that enterprises get these upgrades promptly so that operations run smoothly with the latest technology.

For compute, storage, and network, OpenStack is clearly one of the leaders in the game, with its flexibility and vendor-agnostic nature. The fact that developers are able to create cloud environments with high agility is invaluable for DevOps.

VEXXHOST Private Cloud for DevOps

VEXXHOST is an established cloud provider with a decade’s worth of experience in OpenStack. We build and deploy private clouds for DevOps according to varying specifications and requirements from clients worldwide. We also provide Managed Zuul, a specialized tool that can accompany DevOps cycles. Talk to our team for further assistance. Check our private cloud resources page to learn more about highly secure cloud environments.


The post Why OpenStack Private Cloud for DevOps is the Way Forward appeared first on VEXXHOST.

by Athul Domichen at November 30, 2020 09:14 PM

November 26, 2020

VEXXHOST Inc.

A Brief History of the Open Infrastructure Foundation



Open Infrastructure Foundation is an entity supporting open source development in IT infrastructure globally.

For almost a decade, the governing body of OpenStack and related projects was the OpenStack Foundation. Recently, at the Open Infrastructure Summit 2020, the foundation announced that it has evolved into the Open Infrastructure Foundation.

This move is part of the multi-year drive of community evolution and expansion into including newer projects under the foundation’s wing. In this context, let’s take a look at the timeline that led to this evolution.

The Beginning

The origin of OpenStack, and later the foundation, can be traced back to something that happened in a true open source fashion – a collaboration.

Rackspace was rewriting the infrastructure code behind its cloud offerings (what later became known as Swift) and decided to make the existing code open source. Simultaneously, through its contractor Anso Labs, NASA did the same with Nova, its Python-based cloud fabric controller. The teams realized that the two projects were complementary and decided to collaborate. This shared effort marked the beginning of OpenStack.

Tech professionals from 25+ companies attended the first OpenStack Design Summit, held in July 2010 in Austin, Texas. Team VEXXHOST joined the OpenStack community by the time the second OpenStack release, Bexar, came out.

OpenStack and the community were growing, and there was a need to promote and develop projects in a more sustainable and organized manner. This thought resulted in the creation of the OpenStack Foundation in September 2012.

The Foundation

The creation of the OpenStack Foundation was a defining moment for cloud computing users across the globe. The foundation launched with over 5,600 members representing numerous companies. We are proud to say that VEXXHOST was also part of it all from the very beginning.

To govern OpenStack and other open source projects better, the foundation set up three bodies under its wing – the Board of Directors, the Technical Committee, and the User Committee. Over the years, the foundation grew with the growth of the projects. Recently there arose a need to build a larger umbrella to adopt and develop more open source projects. Hence, the OpenStack Foundation evolved into the Open Infrastructure Foundation, with OpenStack still being in the heart of it all.

The Summits

The first OpenStack Summit was held in Paris in 2014. The event changed its name to the Open Infrastructure Summit with its Denver edition in 2019. Held roughly twice a year, the summits have always given timely boosts to the development of open source. The global community of OpenStack developers, contributors, and users comes together during the summits and shares ideas collectively. VEXXHOST is a regular presence at the summits and won the Superuser Award at the 2019 Denver Summit.

The Open Infrastructure Summit was held virtually from 19th to 23rd October 2020, owing to the pandemic. The foundation announced its evolution and name change at the Summit, and the announcement was greeted with much fanfare.

VEXXHOST and Open Infrastructure Foundation

VEXXHOST was a Corporate Member of the OpenStack Foundation for many years. Our association with the OpenStack community began in 2011, and we’ve been a part of the journey so far as an avid contributor and user. With the latest evolution, we are proud to be a Founding Silver Member of the Open Infrastructure Foundation and accompany it to new heights of open source development.

VEXXHOST has a wide range of cloud solutions powered by OpenStack and other open source projects, including a fully customizable private cloud. If you have further queries on our services, contact us, and we’ll get back to you.

The post A Brief History of the Open Infrastructure Foundation appeared first on VEXXHOST.

by Athul Domichen at November 26, 2020 09:43 PM

November 24, 2020

OpenStack Superuser

Running Percona Kubernetes Operator for Percona XtraDB Cluster with Kata Containers

Kata containers are containers that use hardware virtualization technologies for workload isolation almost without performance penalties. Top use cases are untrusted workloads and tenant isolation (for example in a shared Kubernetes cluster). This blog post describes how to run Percona Kubernetes Operator for Percona XtraDB Cluster (PXC Operator) using Kata containers.

Prepare Your Kubernetes Cluster

Setting up Kata Containers and Kubernetes is well documented in the official GitHub repo (covering CRI-O, containerd, and the Kubernetes DaemonSet). We will just cover the most important steps and pitfalls.

Virtualization Support

First of all, remember that Kata containers require hardware virtualization support from the CPU on the nodes. To check whether your Linux system supports it, run the following on the node:

$ egrep '(vmx|svm)' /proc/cpuinfo

VMX (Virtual Machine Extension) and SVM (Secure Virtual Machine) are Intel and AMD features that add various instructions to allow running a guest OS with full privileges, but still keeping host OS protected.

For example, on AWS only i3.metal and r5.metal instances provide VMX capability.

Containerd

Kata containers are OCI (Open Container Initiative) compliant, which means that they work well with CRI (Container Runtime Interface) and hence are well supported by Kubernetes. To use Kata containers, please make sure your Kubernetes nodes run using the CRI-O or containerd runtimes.

The image below describes pretty well how Kubernetes works with Kata.

Hint: GKE or kops allows you to start your cluster with containerd out of the box and skip manual steps.

Setting Up Nodes

To run Kata containers, k8s nodes need to have kata-runtime installed and the runtime configured properly. The easiest way is to use a DaemonSet, which installs the required packages on every node and reconfigures containerd. As a first step, apply the following YAMLs to create the DaemonSet:

$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/packaging/master/kata-deploy/kata-rbac/base/kata-rbac.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/packaging/master/kata-deploy/kata-deploy/base/kata-deploy.yaml

The DaemonSet reconfigures containerd to support multiple runtimes. It does that by changing /etc/containerd/config.toml. Please note that some tools (e.g., kops) keep the containerd configuration in a separate file, config-kops.toml. You need to copy the configuration created by the DaemonSet to the corresponding file and restart containerd.

Next, create RuntimeClasses for Kata. RuntimeClass is a feature that allows you to pick the runtime for a container at creation time. It has been available as beta since Kubernetes 1.14.

$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/packaging/master/kata-deploy/k8s-1.14/kata-qemu-runtimeClass.yaml
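
For reference, the manifest applied above boils down to a RuntimeClass object along these lines (a sketch only; the field values follow the kata-qemu handler name used in this post):

apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata-qemu
handler: kata-qemu   # must match the runtime name configured in containerd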

Everything is set. Deploy a test nginx pod and set the runtime:

$ cat nginx-kata.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-kata
spec:
  runtimeClassName: kata-qemu
  containers:
    - name: nginx
      image: nginx

$ kubectl apply -f nginx-kata.yaml
$ kubectl describe pod nginx-kata | grep "Container ID"
    Container ID:   containerd://3ba8d62be5ee8cd57a35081359a0c08059cf08d8a53bedef3384d18699d13111

On the node, verify that Kata is used for this container using the ctr tool:

# ctr --namespace k8s.io containers list | grep 3ba8d62be5ee8cd57a35081359a0c08059cf08d8a53bedef3384d18699d13111
3ba8d62be5ee8cd57a35081359a0c08059cf08d8a53bedef3384d18699d13111    sha256:f35646e83998b844c3f067e5a2cff84cdf0967627031aeda3042d78996b68d35 io.containerd.kata-qemu.v2

Runtime is showing kata-qemu.v2 as requested.

The current latest stable PXC Operator version (1.6) does not support runtimeClassName. It is still possible to run Kata containers by specifying the io.kubernetes.cri.untrusted-workload annotation. To ensure containerd supports this annotation, add the following to the configuration toml file on the node:

# cat <<EOF >> /etc/containerd/config.toml
[plugins.cri.containerd.untrusted_workload_runtime]
  runtime_type = "io.containerd.kata-qemu.v2"
EOF

# systemctl restart containerd

Install the Operator

We will install the operator with regular runtime but will put the PXC cluster into Kata containers.

Create the namespace and switch the context:

$ kubectl create namespace pxc-operator
$ kubectl config set-context $(kubectl config current-context) --namespace=pxc-operator

Get the operator from github:

$ git clone -b v1.6.0 https://github.com/percona/percona-xtradb-cluster-operator

Deploy the operator into your Kubernetes cluster:

$ cd percona-xtradb-cluster-operator
$ kubectl apply -f deploy/bundle.yaml

Now let’s deploy the cluster, but before that, we need to explicitly add an annotation to the PXC pods and mark them untrusted to force Kubernetes to use the Kata containers runtime. Edit deploy/cr.yaml:

pxc:
  size: 3
  image: percona/percona-xtradb-cluster:8.0.20-11.1
  …
  annotations:
    io.kubernetes.cri.untrusted-workload: "true"

Now, let’s deploy the PXC cluster:

$ kubectl apply -f deploy/cr.yaml

The cluster is up and running (using one node for the sake of the experiment):

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
pxc-kata-haproxy-0                                 2/2     Running   0          5m32s
pxc-kata-pxc-0                                     1/1     Running   0          8m16s
percona-xtradb-cluster-operator-749b86b678-zcnsp   1/1     Running   0          44m

In the ctr output you should see the percona-xtradb-cluster container running with the Kata runtime:

# ctr --namespace k8s.io containers list | grep percona-xtradb-cluster | grep kata
448a985c82ae45effd678515f6cf8e11a6dfca159c9abf05a906c7090d297cba    docker.io/percona/percona-xtradb-cluster:8.0.20-11.2 io.containerd.kata-qemu.v2

We are working on adding support for the runtimeClassName option to our operators. This feature will let users freely choose any container runtime.

Conclusions

Running databases in containers is an ongoing trend, and keeping data safe is always the top priority for a business. Kata containers provide security isolation through mature and extensively tested qemu virtualization, with little to no change to the existing environment.

Deploy Percona XtraDB Cluster with ease in your Kubernetes cluster with our Operator and Kata containers for better isolation without performance penalties.

This article was originally posted on percona.com/blog. See the original article here.

The post Running Percona Kubernetes Operator for Percona XtraDB Cluster with Kata Containers appeared first on Superuser.

by Sergey Pronin at November 24, 2020 02:00 PM

November 22, 2020

Ghanshyam Mann

OpenStack Victoria CI/CD migration from Ubuntu Bionic (LTS 18.04)-> Focal (LTS 20.04)

OpenStack upstream CI/CD tests changes on defined LTS or stable distribution versions. The OpenStack Technical Committee defines the testing runtime for each cycle. As per the OpenStack Victoria testing runtime, the defined versions are:

  • Ubuntu 20.04
  • CentOS 8
  • openSUSE Leap 15

Ubuntu Focal (Ubuntu LTS 20.04) was released on April 23, 2020, and in OpenStack Victoria (released on 14th Oct 2020) we migrated the upstream CI/CD testing to the above-defined testing runtime. This work was done as one of the community-wide goals, “Migrate CI/CD jobs to new Ubuntu LTS Focal“.

    What is this migration:

OpenStack CI/CD is implemented with Zuul jobs that prepare the node, deploy OpenStack using DevStack, and run tests (Tempest or its plugins, project in-tree tests, rally tests, etc.). The base OS installed on the node is where OpenStack will be deployed by DevStack.

Until the OpenStack Ussuri release, the base OS on the majority of job nodes was Ubuntu Bionic (18.04), so DevStack used to deploy OpenStack on Ubuntu Bionic and then run the tests.

With the new Ubuntu Focal (20.04) version, the node’s base OS has been moved from Ubuntu Bionic to Ubuntu Focal. On every code change, CI will now make sure OpenStack works properly on Ubuntu Focal.

NOTE: This migration targets only zuulv3 native jobs. Legacy jobs are left running on Bionic and are planned to be migrated to Focal as they are converted to zuulv3 native jobs. We had another community-wide goal to migrate all the legacy jobs to zuulv3 native.

 

    Perform migration testing in advance:

We started the work in June and prepared the devstack, tempest, and tox-based base jobs on Focal so that all project gates could be tested and fixed in advance, before the devstack and Tempest base job changes merged. The idea behind the advance testing was to avoid or minimize gate failures in any repo under any project. This advance testing included integration as well as tox-based unit, functional, doc, pep8, and lower-constraints testing.

    Bugs & fixes:

This migration had more things to fix than the previous migration from Ubuntu Xenial to Bionic. One reason was that on Ubuntu Focal many Python dependencies have dropped Python 2.7 support, and MySQL 8.0 introduced compatibility issues. OpenStack had already dropped Python 2.7 in the Ussuri release, but the lower constraints of OpenStack dependencies were not updated to their Python 3-only versions because many of them were not Python 3-only at that time. So on Ubuntu Focal, those dependency versions are Python 3-only, which caused many failures in our lower-constraints jobs.

A few of the key issues we had to fix for this migrations are:

  1. Failing device detachments on Focal: Bug#1882521
  2. A few of the lower constraints are not compatible with Python 3.8: Bug#1886298
  3. Migrations broken on MySQL 8.x: story/2007732
  4. Duplicate check constraint name 'ck_started_before_ended': story/2008121
  5. No implicit user creation with GRANT syntax in MySQL 8.0: Bug#1885825
  6. pyflakes up to 2.1.0 is not compatible with Python 3.8: Bug#1886296

    Actual Migration:

Fixing these bugs took us a lot of time, which is why this migration was late and missed its initial deadlines.

  • Tox based job migration happened on Sept 9th.
  • devstack and Tempest base jobs migration happened on Sept 24th.

All the work for this migration are tracked on: https://storyboard.openstack.org/#!/story/2007865

All changes are: https://review.opendev.org/q/topic:%2522migrate-to-focal%2522+(status:open+OR+status:merged)

    How to migrate the third-party CI:

If your third-party CI jobs are not yet migrated to zuulv3, you first need to migrate the legacy jobs to zuulv3 native. Refer to this community-wide goal for details.

For zuulv3 native jobs, as with upstream jobs, you need to switch the job’s nodeset from Ubuntu Bionic to Ubuntu Focal.

The example below gives a quick glance at changing the nodeset to Ubuntu Focal:
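
An illustrative sketch, assuming a zuulv3 job that inherits from the devstack-tempest base job; the job name is a placeholder, and openstack-single-node-focal is one of the Focal nodesets defined in devstack:

- job:
    name: my-third-party-devstack-job
    parent: devstack-tempest
    # Switch the node from Bionic to Focal by overriding the nodeset.
    nodeset: openstack-single-node-focal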

If you want to verify the nodeset used in your zuul jobs, you can see the hostname and label in job-output.txt.

In the same way, you can migrate your third-party CI to Focal. If a third-party job uses the base job without overriding the ‘nodeset’, then the job is automatically switched to Focal. If the job overrides the ‘nodeset’, you need to switch it to a Focal nodeset as shown above. All the Ubuntu Focal nodesets, from single-node to multinode jobs, are defined in devstack.

We encourage all third-party jobs to migrate to Focal as soon as possible, as devstack will not support Bionic-related fixes from Victoria onwards.

    Dependencies compatibility versions for upgrades to OpenStack Victoria:

Many dependency constraints need to be bumped to upgrade an OpenStack cloud to the Victoria release. To find all those compatible versions, check the project-specific patches merged from here or from Bug#1886298. This can help in preparing for smooth upgrades.

    Completion Summary Report:

  • A lot of work was involved in this. Many more changes were required for this migration compared to the previous migration from Xenial to Bionic.
    • Many lower-constraints dependencies had to be updated to versions compatible with Python 3.8 and the Focal distro, in almost all the repos.
    • MySQL 8.0 caused many DB incompatibility issues.
  • I missed the original deadline of milestone-2 to complete this goal due to failing gates and the need to avoid blocking the gate for any project.
  • More than 300 repos were fixed or tested in advance before the migration happened. This helped a lot to keep the gates green during this migration.

 

I would like to convey special thanks to everyone who helped with this goal and made it possible to complete in the Victoria cycle itself.

by Ghanshyam Mann at November 22, 2020 02:32 AM

November 20, 2020

VEXXHOST Inc.

Why Cloud Networking with OpenStack Neutron Works Great for Enterprises


Cloud networking is an important element within all types of cloud building – public, private, or hybrid. For private clouds from VEXXHOST, our open source networking choice is OpenStack Neutron. We believe that Neutron brings in great value for enterprises in building the ‘central nervous system’ of their cloud. Let us see why.

Overview of OpenStack Neutron

Neutron is an extremely powerful networking project of OpenStack. It is considered complex by many users but let me reassure you that its capabilities make it a virtual powerhouse like nothing else out there. We have a previous post that lays out the basics of OpenStack Neutron.

OpenStack Neutron can help you create virtual networks, firewalls, routers, and more. It is flexible and secure. With it, OpenStack can offer network-connectivity-as-a-service. Neutron also helps other OpenStack projects manage network interface devices through its API.

Here is a breakdown of a few points mentioned above and how they benefit enterprises.

Cloud Networking with Neutron and Benefits for Enterprises

OpenStack Neutron provides cloud tenants with a flexible API, which helps them build strong networking topologies while also allowing them to configure advanced network policies, with no vendor lock-in. A use case of this capability for enterprises is creating multi-tier topologies for web applications.
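
A minimal sketch of such a multi-tier topology using the OpenStack CLI; the network names and CIDRs are placeholders:

$ openstack network create web-tier
$ openstack subnet create --network web-tier --subnet-range 10.0.1.0/24 web-subnet
$ openstack network create db-tier
$ openstack subnet create --network db-tier --subnet-range 10.0.2.0/24 db-subnet
$ openstack router create app-router
$ openstack router add subnet app-router web-subnet
$ openstack router add subnet app-router db-subnet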

Neutron gives organizations peace of mind regarding security and segmentation, as it enables per-tenant networks. These fully isolated networks work almost like having your own secure switch to the servers, with no possibility of anyone else accessing them. Moreover, these connections can be segmented, so that each VM on a given hypervisor remains private to its respective network.

With Neutron for cloud networking, enterprises can leverage automatic IP address management and ensure consistency. This means you don’t have to manage IP addresses manually, and it keeps the running system consistent with the documentation. Another advantage of these dynamic IP addresses is that it eliminates the possibility of IP manipulation through blocking at the layer above.

Did you know that enterprises can use OpenStack Neutron for bare metal scaling? Yes, it is true. Each rack of the system works as a network of its own. The vast scheduling network enables these racks to be interconnected with each other. This capability also allows the system to assign appropriate IP addresses.

Overall, Neutron works as a safe, reliable, and flexible cloud networking option for businesses.

VEXXHOST and OpenStack Neutron

VEXXHOST provides Neutron as our open source solution for networking with private cloud. We also provide various other OpenStack-based services for our clients across the globe. If you want to know more about our services and solutions, contact our team. Improve your knowledge of private clouds from our ever-evolving and dedicated resource page.

Would you like to know more about OpenStack cloud? Download our white paper and get reading!


Conquer The Competition With OpenStack Cloud

The post Why Cloud Networking with OpenStack Neutron Works Great for Enterprises appeared first on VEXXHOST.

by Athul Domichen at November 20, 2020 09:44 PM

November 18, 2020

Adam Young

Keystone and Cassandra: Parity with SQL

Look back at our Pushing Keystone over the Edge presentation from the OpenStack Summit. Many of the points we make are problems faced by any application trying to scale across multiple datacenters. Cassandra is a database designed to deal with this level of scale, so Cassandra may well be a better choice than MySQL or another RDBMS as a datastore for Keystone. What would it take to enable Cassandra support for Keystone?

Let’s start with the easy part: defining the tables. Let’s look at how we define the Federation back end for SQL. We use SQLAlchemy to handle the migrations; we will need something comparable for the Cassandra Query Language (CQL), but we also need to translate the table definitions themselves.

Before we create the tables, we need to create a keyspace. I am going to make separate keyspaces for each of the subsystems in Keystone: Identity, Assignment, Federation, and so on. Here’s the Federation one:

CREATE KEYSPACE keystone_federation WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'}  AND durable_writes = true;

The Identity provider table is defined like this:

    idp_table = sql.Table(
        'identity_provider',
        meta,
        sql.Column('id', sql.String(64), primary_key=True),
        sql.Column('enabled', sql.Boolean, nullable=False),
        sql.Column('description', sql.Text(), nullable=True),
        mysql_engine='InnoDB',
        mysql_charset='utf8')
    idp_table.create(migrate_engine, checkfirst=True)

The comparable CQL to create a table would look like this:

CREATE TABLE identity_provider (id text PRIMARY KEY , enables boolean , description text);

However, when I describe the schema to view the table definition, we see that there are many tuning and configuration parameters that are defaulted:

CREATE TABLE federation.identity_provider (
    id text PRIMARY KEY,
    description text,
    enables boolean
) WITH additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

I don’t know Cassandra well enough to say if these are sane defaults to have in production. I do know that someone, somewhere, is going to want to tweak them, and we are going to have to provide a means to do so without battling the upgrade scripts. I suspect we are going to want to only use the short form (what I typed into the CQL prompt) in the migrations, not the form with all of the options. In addition, we might want an IF NOT EXISTS clause on the table creation to allow people to make these changes themselves. Then again, that might make things get out of sync. Hmmm.

There are three more entities in this back end:

CREATE TABLE federation_protocol (id text, idp_id text, mapping_id text, PRIMARY KEY(id, idp_id));
CREATE TABLE mapping (id text PRIMARY KEY, rules text);
CREATE TABLE service_provider (auth_url text, id text PRIMARY KEY, enabled boolean, description text, sp_url text, relay_state_prefix text);

One thing that is interesting is that we will not be limiting the ID fields to 32, 64, or 128 characters. There is no performance benefit to doing so in Cassandra, nor is there any way to enforce the length limits. From a Keystone perspective, there is not much value either; we still need to validate the UUIDs in Python code. We could autogenerate the UUIDs in Cassandra, and there might be some benefit to that, but it would diverge from the logic in the Keystone code, and explode the test matrix.

There is only one foreign key in the SQL section; the federation protocol has an idp_id that points to the identity provider table. We’ll have to accept this limitation and ensure the integrity is maintained in code. We can do this by looking up the Identity provider before inserting the protocol entry. Since creating a Federated entity is a rare and administrative task, the risk here is vanishingly small. It will be more significant elsewhere.
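
A hypothetical sketch of what that lookup-before-insert could look like with cqlengine models (not Keystone code; the model classes and error handling here are illustrative):

# Hypothetical sketch: enforce the idp_id "foreign key" in application logic,
# since CQL has no referential integrity.
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model


class IdentityProvider(Model):
    __keyspace__ = 'keystone_federation'
    id = columns.Text(primary_key=True)
    enabled = columns.Boolean()
    description = columns.Text()


class FederationProtocol(Model):
    __keyspace__ = 'keystone_federation'
    id = columns.Text(primary_key=True)
    idp_id = columns.Text(primary_key=True)
    mapping_id = columns.Text()


def create_protocol(idp_id, protocol_id, mapping_id):
    # Creating a federation protocol is a rare administrative task, so the
    # extra read to validate the IdP reference is cheap.
    if IdentityProvider.objects.filter(id=idp_id).first() is None:
        raise ValueError('identity provider %s does not exist' % idp_id)
    return FederationProtocol.create(id=protocol_id, idp_id=idp_id,
                                     mapping_id=mapping_id)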

For access to the database, we should probably use Flask-CQLAlchemy. Fortunately, Keystone is already a Flask-based project, so this makes the two projects align.

For migration support, it looks like the best option out there is cassandra-migrate.

An effort like this would best be started out of tree, with an expectation that it would be merged in once it had shown a degree of maturity. Thus, I would put it into a namespace that would not conflict with the existing keystone project. The python imports would look like:

from keystone.cassandra import migrations
from keystone.cassandra import identity
from keystone.cassandra import federation

This could go in its own git repo and be separately pip installed for development. The entrypoints would be registered such that the configuration file would have entries like:

[application_credential]
driver = cassandra

Any tuning of the database could be put under a [cassandra] section of the conf file, or tuning for individual sections could be in keys prefixed with cassandra_ in the appropriate sections, such as application_credential as shown above.
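
For illustration only (these option names are assumptions about how such a driver could be configured, not an existing Keystone interface), that layout might look like:

[cassandra]
contact_points = cassandra1.example.com,cassandra2.example.com
keyspace_prefix = keystone

[application_credential]
driver = cassandra
cassandra_default_time_to_live = 300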

It might be interesting to implement a Cassandra token backend and use the default_time_to_live value on the table to control the lifespan and automate the cleanup of the tables. This might provide some performance benefit over the fernet approach, as the token data would be cached. However, the drawbacks due to token invalidation upon change of data would far outweigh the benefits unless the TTL was very short, perhaps 5 minutes.
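
A hypothetical sketch of what such a token table could look like, relying on default_time_to_live for automatic cleanup (300 seconds, i.e. the short TTL mentioned above); the keyspace and columns are illustrative:

-- Rows expire automatically 300 seconds after they are written.
CREATE TABLE keystone_token.token (
    id text PRIMARY KEY,
    user_id text,
    issued_at timestamp,
    payload text
) WITH default_time_to_live = 300;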

Just making it work is one thing. In a follow on article, I’d like to go through what it would take to stretch a cluster from one datacenter to another, and to make sure that the other considerations that we discussed in that presentation are covered.

Feedback?

by Adam Young at November 18, 2020 09:41 PM

StackHPC Team Blog

OpenStack in the TOP500

For the first time, the November TOP500 list (published to coincide with Supercomputing 2020) includes fully OpenStack-based Software-Defined Supercomputers:

Drawing on experience including the SKA Telescope Science Data Processor Performance Prototyping Platform and Verne Global's hpcDIRECT project, StackHPC helped bootstrap and is providing support for these OpenStack deployments. They are deployed and operated using OpenStack Kayobe and OpenStack Kolla-Ansible.

A key part of the solution is being able to deploy an OpenHPC-2.0 Slurm cluster on server infrastructure managed by OpenStack Ironic. The Dell C6420 servers are imaged with CentOS 8, and we use our OpenHPC Ansible role to both configure the system and build images. Updated images are deployed in a non-impacting way through a custom Slurm reboot script.
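
As an illustrative sketch only (the role name is assumed from Ansible Galaxy and the inventory groups are placeholders, not StackHPC's actual playbooks), applying such an OpenHPC role looks roughly like this:

- hosts: openhpc_login:openhpc_compute
  become: true
  roles:
    # Role variables (cluster name, Slurm controller, partitions) would be
    # set per the role's documentation.
    - role: stackhpc.openhpc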

With OpenStack in control, you can quickly rebalance what workloads are deployed. Users can move capacity between multiple Bare Metal, Virtual Machine and Container based workloads. In particular, OpenStack Magnum provides on demand creation of Kubernetes clusters, an approach popularised by CERN.

In addition to user workloads, the solution interacts with iDRAC and Redfish management interfaces to control server configurations, remediate faults and deliver overall system metrics. This was critical in optimising the data centre environment and resulted in the high efficiency achieved in the TOP500 list.

Redfish telemetry gathered while running LINPACK benchmarks

For more details, please watch our recent presentation from the OpenInfra Summit:

Get in touch

If you would like to get in touch we would love to hear from you. Reach out to us via Twitter or directly via our contact page.

by John Garbutt at November 18, 2020 11:00 AM

November 16, 2020

OpenStack Blog

Wallaby vPTG Summaries

The OpenStack community had its second virtual Project Teams Gathering (PTG) following the Open Infrastructure Summit in October. Over 500 individuals and 46 teams (30+ OpenStack teams) across the globe, met and collaborated at the vPTG. Since the event concluded, several of those teams have posted summaries of the discussions they have had and the... Read more »

by Kendall Nelson at November 16, 2020 08:39 PM

RDO

RDO Victoria Released


The RDO community is pleased to announce the general availability of the RDO build for OpenStack Victoria for RPM-based distributions, CentOS Linux and Red Hat Enterprise Linux. RDO is suitable for building private, public, and hybrid clouds. Victoria is the 22nd release from the OpenStack project, which is the work of more than 1,000 contributors from around the world.

The release is already available on the CentOS mirror network at http://mirror.centos.org/centos/8/cloud/x86_64/openstack-victoria/.

The RDO community project curates, packages, builds, tests and maintains a complete OpenStack component set for RHEL and CentOS Linux and is a member of the CentOS Cloud Infrastructure SIG. The Cloud Infrastructure SIG focuses on delivering a great user experience for CentOS Linux users looking to build and maintain their own on-premise, public or hybrid clouds.

All work on RDO and on the downstream release, Red Hat OpenStack Platform, is 100% open source, with all code changes going upstream first.

PLEASE NOTE: RDO Victoria provides packages for CentOS 8 and Python 3 only. Please use the Train release for CentOS 7 and Python 2.7.

Interesting things in the Victoria release include:

  • With the Victoria release, source tarballs are validated using the upstream GPG signature. This certifies that the source is identical to what is released upstream and ensures the integrity of the packaged source code.
  • With the Victoria release, openvswitch/ovn are not shipped as part of RDO. Instead RDO relies on builds from the CentOS NFV SIG.
  • Some new packages have been added to RDO during the Victoria release:
    • ansible-collections-openstack: This package includes OpenStack modules and plugins which are supported by the OpenStack community to help with the management of OpenStack infrastructure.
    • ansible-tripleo-ipa-server: This package contains Ansible for configuring the FreeIPA server for TripleO.
    • python-ibmcclient: This package contains the python library to communicate with HUAWEI iBMC based systems.
    • puppet-powerflex: This package contains the puppet module needed to deploy PowerFlex with TripleO.
  • The following packages have been retired from the RDO OpenStack distribution in the Victoria release:
    • The Congress project, an open policy framework for the cloud, has been retired upstream and from the RDO project in the Victoria release.
    • neutron-fwaas, the Firewall as a Service driver for neutron, is no longer maintained and has been removed from RDO.

Other highlights of the broader upstream OpenStack project may be read via https://releases.openstack.org/victoria/highlights.

Contributors
During the Victoria cycle, we saw the following new RDO contributors:

Amy Marrich (spotz)
Daniel Pawlik
Douglas Mendizábal
Lance Bragstad
Martin Chacon Piza
Paul Leimer
Pooja Jadhav
Qianbiao NG
Rajini Karthik
Sandeep Yadav
Sergii Golovatiuk
Steve Baker

Welcome to all of you and Thank You So Much for participating!

But we wouldn’t want to overlook anyone. A super massive Thank You to all 58 contributors who participated in producing this release. This list includes commits to rdo-packages, rdo-infra, and redhat-website repositories:

Adam Kimball
Ade Lee
Alan Pevec
Alex Schultz
Alfredo Moralejo
Amol Kahat
Amy Marrich (spotz)
Arx Cruz
Bhagyashri Shewale
Bogdan Dobrelya
Cédric Jeanneret
Chandan Kumar
Damien Ciabrini
Daniel Pawlik
Dmitry Tantsur
Douglas Mendizábal
Emilien Macchi
Eric Harney
Francesco Pantano
Gabriele Cerami
Gael Chamoulaud
Gorka Eguileor
Grzegorz Grasza
Harald Jensås
Iury Gregory Melo Ferreira
Jakub Libosvar
Javier Pena
Joel Capitao
Jon Schlueter
Lance Bragstad
Lon Hohberger
Luigi Toscano
Marios Andreou
Martin Chacon Piza
Mathieu Bultel
Matthias Runge
Michele Baldessari
Mike Turek
Nicolas Hicher
Paul Leimer
Pooja Jadhav
Qianbiao.NG
Rabi Mishra
Rafael Folco
Rain Leander
Rajini Karthik
Riccardo Pittau
Ronelle Landy
Sagi Shnaidman
Sandeep Yadav
Sergii Golovatiuk
Slawek Kaplonski
Soniya Vyas
Sorin Sbarnea
Steve Baker
Tobias Urdin
Wes Hayutin
Yatin Karel

The Next Release Cycle
At the end of one release, focus shifts immediately to the next release i.e Wallaby.

Get Started
There are three ways to get started with RDO.

To spin up a proof-of-concept cloud quickly and on limited hardware, try an all-in-one Packstack installation. You can run RDO on a single node to get a feel for how it works.
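
A rough sketch of what that looks like on a fresh CentOS 8 node (check the RDO Packstack documentation for the exact, current steps):

$ sudo dnf install -y centos-release-openstack-victoria
$ sudo dnf update -y
$ sudo dnf install -y openstack-packstack
$ sudo packstack --allinone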

For a production deployment of RDO, use TripleO and you’ll be running a production cloud in short order.

Finally, for those that don’t have any hardware or physical resources, there’s the OpenStack Global Passport Program. This is a collaborative effort between OpenStack public cloud providers to let you experience the freedom, performance and interoperability of open source infrastructure. You can quickly and easily gain access to OpenStack infrastructure via trial programs from participating OpenStack public cloud providers around the world.

Get Help
The RDO Project has our users@lists.rdoproject.org for RDO-specific users and operators. For more developer-oriented content we recommend joining the dev@lists.rdoproject.org mailing list. Remember to post a brief introduction about yourself and your RDO story. The mailing lists archives are all available at https://mail.rdoproject.org. You can also find extensive documentation on RDOproject.org.

The #rdo channel on Freenode IRC is also an excellent place to find and give help.

We also welcome comments and requests on the CentOS devel mailing list and the CentOS and TripleO IRC channels (#centos, #centos-devel, and #tripleo on irc.freenode.net), however we have a more focused audience within the RDO venues.

Get Involved
To get involved in the OpenStack RPM packaging effort, check out the RDO contribute pages, peruse the CentOS Cloud SIG page, and inhale the RDO packaging documentation.

Join us in #rdo and #tripleo on the Freenode IRC network and follow us on Twitter @RDOCommunity. You can also find us on Facebook and YouTube.

by Amy Marrich at November 16, 2020 02:27 PM

Galera Cluster by Codership

Taking SkySQL for a spin to launch a Galera Cluster

If you haven’t already tried SkySQL, it is worth noting that you can now launch a Galera Cluster with it. SkySQL is an automated Database as a Service (DBaaS) solution to launch a Galera Cluster within Google Cloud Platform. Launching a Galera Cluster is currently a tech preview, and you are still eligible for USD$500 worth of credit, which should let you evaluate it for quite a bit.

When you choose Transactions (it also supports Analytics, Both (HTAP) and Distributed SQL which is also Galera Cluster), you’ll notice that you can launch the Galera Cluster tech preview in multiple regions: Americas, APAC, or EMEA. Costs per hour for Sky-4×15 which has 4 vCPUs and 15GB of memory is USD$0.6546/hour/node (and when you think about it, you’re getting a minimum of 3 Galera Cluster nodes and one MaxScale node which acts as a load balancer and endpoint for your application). You’ll also pay a little more for storage (100GB SSD storage is USD$0.0698/hour due to the 3 nodes). So overall, expect an estimated total of USD$1.9638/hour for the Sky-4×15 nodes, and $0.0698/hour for the 100GB storage per node, bringing your total to USD$2.0336/hour.

Once launched, you’ll note that the service state will be pending until all the nodes are launched. During this time you will also have to whitelist the IP addresses that are going to access the endpoint. Doing so is extremely straightforward, as it does automatic detection within the browser for you. You’ll probably need to add a few more for your application and so forth, but this is extremely straightforward and very well documented.

You’re then given temporary service login credentials, and again, it is extremely well documented. You also get an SSL certificate to login with, and considering this is using the cloud, it makes absolute sense.

A quick point to note: you may see an error like this when trying to connect to the MaxScale endpoint, especially if you’re using the MySQL 8 client: ERROR 1105 (HY000): Authentication plugin 'MariaDBAuth' failed. The easy fix is, of course, to use the proper client library. You are also automatically connected to one of the three nodes in the cluster.
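
For illustration, connecting with the MariaDB client and the downloaded certificate looks roughly like this; the endpoint, port, user and certificate path are placeholders taken from the SkySQL portal:

$ mariadb --host <maxscale-endpoint> --port <port> --user <service-user> -p \
          --ssl-ca ~/Downloads/skysql_chain.pem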

Overall, when we evaluated it, you end up with a 10.5.5-3-MariaDB-enterprise-log server, which means it also comes with a few handy additions not present in the community versions: GCache encryption, BlackBox, and non-blocking DDL (wsrep_osu_method = NBO is an option). When you run SHOW VARIABLES you will notice a few new additions, some of which include: wsrep_black_box_name, wsrep_black_box_size, and an obviously new wsrep_provider, libgalera_enterprise_smm.so.

Why not take SkySQL for a spin? It is a really easy way to launch a Galera Cluster, and you also have a $500 credit. Load some data. Send some feedback. And if you’re interested in learning more, why not attend: Achieving uninterrupted availability with clustering and transaction replay on November 17 at 10 a.m. PT and 4 p.m. CET? This and more will be discussed at the webinar.

by Sakari Keskitalo at November 16, 2020 09:25 AM

November 11, 2020

Fleio Blog

2020.11.1: docker deploy, logging redesign, process clients performance improvements, magnum Kubernetes improvements

Today, 11th of November, 2020, we have released v2020.11.1. The latest version is marked as Stable and can be used for production environments. Docker integration If you didn’t read our latest blog post, we want to let you know that you can now install Fleio through Docker and Docker Compose by running a single command. […]

by Marian Chelmus at November 11, 2020 08:24 AM

November 10, 2020

Alessandro Pilotti

Windows on ARM64 with Cloudbase-Init

ARM servers are more and more present in our day to day life, their usage varying from minimal IoT devices to huge computing clusters. So we decided to put the Windows support for ARM64 cloud images to the test, with two primary focuses:

  • Toolchain ecosystem – Building and running Cloudbase-Init on Windows ARM64
  • Virtualization – Running Windows and Linux virtual machines on Windows ARM64

Our friends from https://amperecomputing.com kindly provided the computing resources that we used to check the current state of Windows virtualization on ARM64.

The test lab consisted of 3 Ampere Computing EMAG servers (Lenovo HR330A – https://amperecomputing.com/emag), each with 32 ARM64 processors, 128 GB of RAM and 512 GB SSD.


Toolchain ecosystem on Windows ARM64: building and running Cloudbase-Init

Cloudbase-Init is a provisioning agent designed to initialize and configure guest operating systems on various platforms: OpenStack, Azure, Oracle Cloud, VMware, Kubernetes CAPI, OpenNebula, Equinix Metal (formerly: Packet), and many others.

Building and running Cloudbase-Init requires going through multiple layers of an OS ecosystem, as it needs a proper build environment, C compiler for Python and Python extensions, Win32 and WMI wrappers, a Windows service wrapper and an MSI installer.

This complexity made Cloudbase-Init the perfect candidate for checking the state of the toolchain ecosystem on Windows ARM64.


Install Windows 10 PRO ARM64 on the EMAG ARM servers

EMAG servers come with CentOS 7 preinstalled, so the first step was to have a Windows ARM64 OS installed on them.

Windows Server ARM64 images are unfortunately not publicly available, so the best option is to use the Windows Insider (https://insider.windows.com/) Windows 10 PRO ARM64 images available for download.

As there is no ISO available on the Windows Insiders website, we had to convert the VHDX to a RAW file using qemu-img.exe, boot a Linux Live ISO which had the dd binary tool on it (Ubuntu is great for this) on the EMAG server, and copy the RAW file content directly onto the primary disk.
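
For illustration, a hedged sketch of those two steps; the file names and the target device are placeholders, and you should double-check the correct disk before running dd:

# On the Windows machine: convert the downloaded VHDX to a RAW image
qemu-img convert -f vhdx -O raw Windows10_ARM64.vhdx windows10_arm64.raw

# From the Ubuntu Live session on the EMAG server: write the RAW image to the primary disk
sudo dd if=/media/usb/windows10_arm64.raw of=/dev/sda bs=4M status=progress conv=fsync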

For the dd step, we needed a Windows machine on which to download and convert the Windows 10 PRO ARM64 VHDX, plus two USB sticks: one for the Ubuntu Live ISO and one for the Windows 10 PRO ARM64 RAW file.

Rufus was used for creating the Ubuntu Live ISO USB and copying the RAW file to the other USB stick. Note that one USB stick must be at least 32 GB in size to cover the ~25GB of the Windows RAW file.

Tools used for the dd step:

After the dd process succeeded, a server reboot was required. The first boot took a while for the Windows device initialization, followed by the usual “Out of the box experience”.

The following steps show how we built Cloudbase-Init for ARM64. As a side note, Windows 10 ARM64 has a built-in emulator for x86, but not for x64. Practically, we could run the x86 version of Cloudbase-Init on the system, but it would have run very slowly and some features would have been limited by the emulation (such as starting native processes).


Gather information on the toolchain required to build Cloudbase-Init

The Cloudbase-Init ecosystem consists of these main building blocks:

  • Python for Windows ARM64
  • Python setuptools
  • Python pip
  • Python PyWin32
  • Cloudbase-Init
  • OpenStack Service Wrapper executable

Toolchain required:

  • Visual Studio with ARM64 support (2017 or 2019)
  • git



Python for Windows ARM64

Python 3.x for ARM64 can be built using Visual Studio 2017 or 2019. In our case, we used the freely available Visual Studio 2019 Community Edition, downloadable from https://visualstudio.microsoft.com/downloads/.

The required toolchain / components for Visual Studio can be installed using this vsconfig.txt. This way, we make sure that the build environment is 100% reproducible.

Python source code can be found here: https://github.com/python/cpython.

To make the build process even easier, we leveraged GitHub Actions to easily build Python for ARM64. An example workflow can be found here: https://github.com/cloudbase/cloudbase-init-arm-scripts/blob/main/.github/workflows/build.yml.

Also, prebuilt archives of Python for Windows ARM64 are available for download here: https://github.com/ader1990/CPython-Windows-ARM64/releases.

Notes:


Python setuptools

Python setuptools is a Python package that handles the “python setup.py install” workflow.

Source code can be found here: https://github.com/pypa/setuptools.

The following patches are required for setuptools to work:

Installation steps for setuptools (Python and Visual Studio are required):

set VCVARSALL="C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat"
set CL_PATH="C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\HostX86\ARM64\cl.exe"
set MC_PATH="C:\Program Files (x86)\Windows Kits\10\bin\10.0.17763.0\arm64\mc.exe"

call %VCVARSALL% amd64_arm64 10.0.17763.0 & set

git clone https://github.com/ader1990/setuptools 1>nul
IF %ERRORLEVEL% NEQ 0 EXIT 1

pushd setuptools
    git checkout am_64
    echo "Installing setuptools"
    python.exe bootstrap.py 1>nul 2>nul
    IF %ERRORLEVEL% NEQ 0 EXIT 1

    %CL_PATH% /D "GUI=0" /D "WIN32_LEAN_AND_MEAN" /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE launcher.c /O2 /link /MACHINE:ARM64 /SUBSYSTEM:CONSOLE /out:setuptools/cli-arm64.exe
    IF %ERRORLEVEL% NEQ 0 EXIT 1

    python.exe setup.py install 1>nul
    IF %ERRORLEVEL% NEQ 0 EXIT 1
popd


Python pip

Python pip is required for easier management of Cloudbase-Init’s requirements installation and wheel building.

Python’s wheel package is required to build wheels. Wheels are pre-built versions of Python packages: there is no need for a compiler to install a package from source, as long as the wheel was built for the exact system version it is installed on.

Pip sources can be found here: https://github.com/pypa/pip.

The following pip patch is required: https://github.com/ader1990/pip/commit/0559cd17d81dcee43433d641052088b690b57cdd.

The patch introduces two binaries required for ARM64, which were built from: https://github.com/ader1990/simple_launcher/tree/win_arm64

This patched version of pip can use the wheel to create proper binaries for ARM64 (like setuptools).

Installation steps for wheel (Python is required):

echo "Installing pip"
python.exe -m easy_install https://github.com/ader1990/pip/archive/20.3.dev1.win_arm64.tar.gz 1>nul 2>nul
IF %ERRORLEVEL% NEQ 0 EXIT


Python PyWin32

The Python PyWin32 package is a wrapper for (almost) all the Win32 APIs in Windows. It is a behemoth from the source code perspective, although Cloudbase-Init uses only a limited subset of Win32 APIs via PyWin32.

Source code can be found here: https://github.com/mhammond/pywin32.

The following patches are required:

Installation steps for PyWin32 (Python 3.8 and Visual Studio 2019 are required):

echo "Installing pywin32"
git clone https://github.com/ader1990/pywin32 1>nul
IF %ERRORLEVEL% NEQ 0 EXIT 1
pushd pywin32
    git checkout win_arm64
    IF %ERRORLEVEL% NEQ 0 EXIT 1
    pushd "win32\src"
        %MC_PATH% -A PythonServiceMessages.mc -h .
    popd
    pushd "isapi\src"
        %MC_PATH% -A pyISAPI_messages.mc -h .
    popd
    mkdir "build\temp.win-arm64-3.8\Release\scintilla" 1>nul 2>nul
    echo '' > "build\temp.win-arm64-3.8\Release\scintilla\scintilla.dll"
    python.exe setup.py install --skip-verstamp
    IF %ERRORLEVEL% NEQ 0 EXIT 1
popd

The build process takes quite a lot of time, at least half an hour, so we took a(nother) cup of coffee and enjoyed the extra time.

The patches hardcode some compiler quirks for Visual Studio 2019 and remove some unneeded extensions from the build. There is work in progress to prettify and upstream the changes.


Cloudbase-Init

Now, as all the previous steps have been completed, it is time to finally build Cloudbase-Init. Thank you for your patience.

Source code can be found here: https://github.com/cloudbase/cloudbase-init

Installation steps for Cloudbase-Init (Python and Visual Studio are required):

echo "Installing Cloudbase-Init"
git clone https://github.com/cloudbase/cloudbase-init 1>nul
IF %ERRORLEVEL% NEQ 0 EXIT 1
pushd cloudbase-init
  echo "Installing Cloudbase-Init requirements"
  python.exe -m pip install -r requirements.txt 1>nul
  IF %ERRORLEVEL% NEQ 0 EXIT 1
  python.exe -m pip install .  1>nul
  IF %ERRORLEVEL% NEQ 0 EXIT 1
popd

After the installation steps are completed, the cloudbase-init.exe ARM64 executable wrapper will be available.


OpenStack Service Wrapper executable

Cloudbase-Init usually runs as a service at every boot. As cloudbase-init.exe is a normal executable, it needs a service wrapper for Windows. A service wrapper is a small program that implements the hooks for the Windows service actions, like start, stop and restart.

Source code can be found here: https://github.com/cloudbase/OpenStackService

The following patch was required: https://github.com/ader1990/OpenStackService/commit/a48c4e54b3f7db7d4df163a6d7e13aa0ead4a58b

For an easier build process, a GitHub actions workflow file can be found here: https://github.com/ader1990/OpenStackService/blob/arm64/.github/workflows/build.yml

A prebuilt release binary for OpenStackService ARM64 is available for download here: https://github.com/ader1990/OpenStackService/releases/tag/v1.arm64


Epilogue

Now we are ready to use Cloudbase-Init for guest initialization on Windows 10 PRO ARM64.

Main takeaways:

  • The main building blocks (Python and Visual Studio) are in great shape to be used for ARM64 applications
  • Some of the Python packages required for Cloudbase-Init still need minor tweaks when it comes to the build process on ARM64.

The post Windows on ARM64 with Cloudbase-Init appeared first on Cloudbase Solutions.

by Adrian Vladu at November 10, 2020 04:10 PM

November 07, 2020

Slawek Kaplonski

Installation of Openshift on the OpenStack cloud

Installation of Openshift on the OpenStack cloud. The last “Day of Learning” at Red Hat was a great opportunity for me to spend a whole day learning something new, and I chose to learn a bit about installation and management of an Openshift cluster. This post is mostly a note to myself about what I did during that training. I was using Openshift 4.6.1 and I installed it on an OpenStack based cloud.

November 07, 2020 04:02 PM

November 05, 2020

Fleio Blog

2020.11.0 beta: deploy Fleio with docker, Magnum Kubernetes improvements, features per user group

Fleio version 2020.11.0 is now available in beta and you can test it in your lab environment, since this release is not recommended for production. You can now install Fleio through docker and docker-compose by running a single command. Fleio now works on Ubuntu 20.04 and CentOS 8. Note that we continue to support Fleio […]

by adrian at November 05, 2020 03:38 PM

November 02, 2020

Slawek Kaplonski

Virtual PTG October 2020 - summary of the Neutron sessions

Yet another virtual PTG happened between the 26th and 30th of October. The Neutron team had sessions on each day of the PTG. Below is a summary of what we discussed and agreed on during that time. The etherpad with notes from the discussions can be found at OpenDev Etherpad. Retrospective of the Victoria cycle: among the good things during the Victoria cycle the team pointed to completing 8 blueprints (including Metadata over IPv6), improved feature parity in the OVN driver, and good review velocity. Among the not so good things we mentioned:

November 02, 2020 10:16 PM

November 01, 2020

Ghanshyam Mann

Recap of OpenStack policy popup Wallaby Virtual Forum & PTG, 2020

Due to the COVID-19 pandemic, the OpenStack PTG for Wallaby development planning and discussions was held virtually from Oct 26–30, 2020. Overall we had a good discussion and another successful virtual PTG, though most of us missed the face-to-face interaction. This blog covers the policy popup (consistent RBAC) discussions that happened in the Forum and in the Monday and Tuesday PTG sessions.

    Forum:

Etherpad

Progress of consistent RBAC:

  • Three projects completed so far – Keystone, Nova, and Cyborg
  • One in progress – Barbican
  • Planning in Wallaby – Neutron, Manila

We discussed the current and possible future challenges while migrating to the new policy. The policy file in JSON format is one known issue, and we talked about a current workaround and the long-term plan.

Deprecation warnings are still an issue, and a lot of warnings are logged; this is tracked as a nova bug. Also, the HTML version of the policy doc does not have the deprecated rule and reason (example: https://docs.openstack.org/nova/latest/configuration/policy.html). We need to add it to these docs too.

A clear step-by-step document about how to use system scope in clouds.yaml, as well as in general with all migration steps, is much needed.
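
As an illustration of one of the steps such a document would cover, here is a hedged example of requesting a system-scoped token with the OpenStack client (the endpoint and credentials are placeholders, and it assumes a python-openstackclient recent enough to support system scope):

# Request a system-scoped token instead of a project-scoped one
openstack --os-auth-url https://keystone.example.com/v3 \
          --os-username admin --os-user-domain-name Default \
          --os-password secret \
          --os-system-scope all \
          token issue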

We also asked whether any deployment has migrated to the new policy, but there are none yet.

    PTG:

Etherpad

We carried the Forum session discussions into the PTG, with a few extra topics.

    Migrate Default Policy Format from JSON to YAML (All projects)

We talked about it and decided that it would be great to do this in advance, before projects start moving towards the new policy, and that having this as a community goal in Wallaby will help the effort move faster. I have proposed this as a goal in the TC PTG and it was agreed to select it for Wallaby (Goal proposal). We also need to update devstack for the neutron policy file, which is policy.json.
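
For operators, the conversion itself should be a one-liner; here is a hedged sketch using the converter tool shipped with newer oslo.policy releases (the namespace and paths are illustrative):

# Convert an overridden JSON policy file to YAML, preserving the overrides
oslopolicy-convert-json-to-yaml --namespace nova \
    --policy-file /etc/nova/policy.json \
    --output-file /etc/nova/policy.yaml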

    Deprecation warning on-demand based on config option?

There is no ideal solution for deprecation warnings when there are a lot of them, as in the new policy work. We cannot stop logging warnings completely. We discussed providing a config option to disable the warnings (enabled by default), and only for default value changes, not policy name changes. A policy name change is critical even when the policy is overridden, which is why we should not make it switchable. This way operators can disable the warnings if they are too noisy, after seeing them for the first time.

    New policy in Horizon

It is challenging to adopt the new policy in Horizon when some projects have new policies and some do not. We left this for now and will continue brainstorming once we do some investigation on the usage of system tokens with the new and old policies. For now, amotoki proposed a workaround to keep the policy file with the deprecated rules as the overridden rules.

    What are the expectations from project teams?

In the end, we discussed which things should be targeted as part of the new policy work and which as a separate effort. Below is the list, and I will be documenting it on the wiki.

  • Things to do as part of new policy effort:
    • Migrating JSON to YAML (in advance though)
    • Policy Tests coverage
    • System scoped administrator policy support and “reader” role support
    • upgrade-check support when changing the default policy
  • As separate effort:
    • Remove hard-coded admin checks in the API/DB/”Views”
    • Standardizing policy names (depends on projects)

by Ghanshyam Mann at November 01, 2020 06:36 PM

Recap of OpenStack Quality Assurance Wallaby Virtual PTG, 2020

Due to the COVID-19 pandemic, the OpenStack PTG for Wallaby development planning and discussions was held virtually from Oct 26–30, 2020. Overall we had a good discussion and another successful virtual PTG, though most of us missed the face-to-face interaction. This blog covers the QA (Quality Assurance) discussions that happened on Monday and Tuesday.

Etherpad

    Victoria Retrospective:

We talked about the retrospective of the last cycle; one main challenge we discussed is slow reviews in QA due to fewer contributors. Below are the action items that came up for improving things:

Action Items:

  • Microversion testing needs more work, especially when you try more combinations or ‘latest’ (gmann).
  • Triage bugs in the office hour as 30 min -> update agenda
  • Speed up the review.

 

    Office hour schedule:

With daylight saving time ending in November, we decided to shift the QA office hour an hour later, to 14 UTC.

 

    Move back the single horizon test in tempest tree from tempest-horizon:

The Tempest and Horizon teams decided to bring the single horizon test back into the Tempest tree and retire the tempest-horizon plugin, which was too much maintenance for a single test. It is unlikely to happen, but if the Tempest team plans to remove this test in the future, the Horizon team needs to be consulted first.

Action items:

  • Move test back to Tempest – gmann
  • Deprecate and retire tempest-horizon – amotoki
  • Check with the release team if any other release is needed. (amotoki + gmann)
  • Modify the ‘horizon-dsvm-tempest-plugin’ job to point to the tests’ new location

 

   Add run_validation as a centralized point for Tempest scenario tests:

For now, validation resource automation and skipping the ssh part via run_validation is done in API tests only, and this proposal is to extend it to scenario tests as well. There is no reason not to do that, and it will help to run Tempest scenario tests on images where ssh is not allowed. There are situations where complete tests need to be skipped, but we will look at those case by case; at least skipping automatically will help testers avoid explicitly adding the scenario tests to the skip list.

Action Items:

  • Implement it in scenario tests – paras333
  • Refactor scenario manager’s create_server method to consume common/compute.py – paras333

 

    Patrole stable release:

We are lacking maintainers in the Patrole project, which is a big challenge for releasing a stable version of it. Sonia and Doug (already helping) will try to spend more bandwidth on Patrole. Another topic was how to reduce the Patrole test execution time. We discussed a few options, like having the policy engine return the API result based on a flag, or using a fake driver, but we did not settle on any option yet as they need more investigation.

Action items:

  • Keep eyes on gate and how tests are running and based on that we can make a stable release for Wallaby. – gmann
  • Keep exploring the options for making patrole lightweight. – gmann, doug

 

    Grenade: extending the testing for zero downtime upgrade testing:

This is just a thought. There is a TC tag ‘assert:supports-zero-downtime-upgrade’ which is not used by any of the projects, and we also do not have any testing framework which can verify this tag. We talked about whether we can do such testing in Grenade or not. Before moving forward with discussion and investigation, we checked if anyone could volunteer for this work. As of now, no volunteers.

 

    Use different guest image for gate jobs to run tempest tests:

We currently only test with the CirrOS guest image to create test VMs, and the proposal from Paras is to try more images to cover different configuration scenarios in upstream testing. This will help catch more failures and scenarios at the upstream gate itself, compared to the current situation where most of them are reported from downstream testing.

Action items:

  • Add a job as experimental – paras333

 

    Tempest Cred Manager proposed changes:

The ‘primary’ and ‘alt_primary’ credentials in Tempest are hardcoded to non-admin and are assigned the configured ‘tempest_role’. There is no way we can assign a different role to these creds. The idea here is to make the ‘primary’ and ‘alt_primary’ credentials configurable so that different deployments can configure them with different roles in various combinations. We will be adding two config options similar to ‘tempest_role’, defaulting to an empty list, so that we keep the current default behavior. That way there is no backward-incompatible change; instead it is an additional new capability.

Action Items:

  • Doug needs to propose the patch.

 

    Tempest Scenario manager stable version brainstorming:

We talked about a few method changes to make the scenario manager stable. If a method is used only among plugins and not in Tempest itself, then we do not actually need to move it to Tempest and it can stay on the plugin side. We ran out of time here and will continue the brainstorming in office hours or so.
(this etherpad – https://etherpad.opendev.org/p/tempest-scenario-manager)

Action Items:

  • Sonia and Martin will continue working on this.

 

    Wallaby Priority & Planning:

We did not prioritize items as such, but listed all work items in the etherpad along with the Victoria cycle backlog.

by Ghanshyam Mann at November 01, 2020 06:18 PM

Recap of OpenStack Technical Committee Wallaby Virtual PTG, 2020

Due to the COVID-19 pandemic, the OpenStack PTG for Wallaby development planning and discussions was held virtually from Oct 26–30, 2020. Overall we had a good discussion and another successful virtual PTG, though most of us missed the face-to-face interaction. This blog covers the TC discussions that happened on Tuesday and Friday.

 

 Etherpad

    Wallaby Leaderless assignment:

Four projects are still pending leadership assignment. We discussed the next steps. If you would like to help with any of these, or are using them in your cloud/distro, this is the right time to step up:

1. Karbor

  • Action item: diablo_rojo will start the retirement process.

2. Qinling

  • Action item: gmann to start the retirement process.

3. Searchlight

  • Action item: gmann to start the retirement process.

4. Placement

  •  Action item: gibi will update TC about the final discussion and accordingly mnaser to propose the patch in governance.

 

    Victoria Retrospective:

There are a couple of things the TC finished in the last cycle; a few of them are:

  • Reduced the size of the TC to speed things up.
  • Updated the policy to add projects faster.
  • Outlined the Distributed Project Leadership Model as an alternative leadership solution to the PTL. Oslo will be the first project to try this new model.
  • Normalized retirement processes. All the retired repos cleanup is finished.
  • Tag cleanups & additions. This is not complete yet, but started with the tc-approved-release tag in the Victoria cycle.
  • Merged the UC with the TC. This is one of the key things finished and the best strategy to make users and developers work more closely.

 

    TC Meeting time:

The TC is going to try (re-try) holding a weekly meeting every Thursday at 15:00 UTC.

Action Items:

  • mnaser: propose patch.

 

    TC policy regarding OpenStack client:

Forum Discussion

This is one of the important items, and there is still no good progress on it. From the TC perspective, we talked about how to ask or motivate projects to migrate to OSC. The TC needs to be very clear on whether this is a strict policy or just guidelines. After a long discussion, we are going with the strategy below, which will be documented as a TC resolution:

  •  All user docs should have `openstack` client commands.
  •  Focus on user docs first and then ops docs later
  •  All ci-tooling should have `openstack` client usage
  •  Use the process to find what is missing so we can prioritize it.

Action items:

  •  diablo_rojo to work on initial resolution proposal

 

    Finalize the Wallaby cycle goal selection:

There are two proposals for the Wallaby community-wide goals. The first is ‘oslo.rootwrap to oslo.privsep’, which was already discussed in the last cycle and is all set to be selected. The second proposal came up during the PTG itself. During the policy-popup PTG, it came up that deprecating the JSON format of the policy file would be a good step to take in advance, before projects move to the new RBAC policies. This will help operators smoothly migrate to the new policies. It does not involve much work and is fine to select as the second goal for the Wallaby cycle.

TC agreed to have the below goals for the Wallaby cycle:

Action items:

  • TC to merge the patches which select both goals for the Wallaby cycle.

 

    TC tag cleanup:

In the Victoria cycle, we started the process of auditing all the TC tags and cleaning them up. We removed the ‘tc:approved-release’ tag in the Victoria cycle. In this PTG we discussed two more tags.

1. assert:supports-zero-downtime-upgrade:

Currently, no project has this tag and there is also no testing framework available. Testing for zero downtime is not easy in upstream testing. We decided to remove this tag as it is advertising something we aren’t doing. If anyone is interested in spending time on this in the future, we can add it back after projects start testing and documenting it.

2. assert:supports-api-interoperability:

This tag asserts whether a project’s API is interoperable, which is also important for the interop trademark program. Nova is currently the only project with this tag. Our goal is to encourage more projects to apply for it. During the discussion, we found that we need to clarify this tag more clearly. For example, the tag is not about implementing microversions specifically, but about any versioning scheme that provides discoverability of feature (API) changes. It is also about how we change the API, not how our APIs currently are. As long as a service has some versioning mechanism to discover changes, follows the API SIG guidelines for interoperability, and tests this in a branchless way, that service is eligible to apply for this tag.

The TC will document this tag more clearly and encourage each project to start working towards applying for it.

Action Items:

  • Graham to put up a change to update the wording to allow for discovery mechanisms.
  • Should include the release in which they started to assert this tag.
  • Document that it’s about following the guidelines for future changes, not existing ones. – gmann
  • Should socialize this out to the projects after we get the documentation improvements landed. – gmann

 

    k8s Community Bridging :

This was one of the most exciting discussions for everyone. We hosted a cross-community meeting with the Kubernetes Steering Committee; Bob and dims from the committee joined us. It started with a quick introduction from both teams, followed by discussion of the topics below:

  • Sharing governance models:

The k8s governance hierarchy is LF -> CNCF -> k8s Steering Committee -> various SIGs/working groups. The k8s Steering Committee consists of 7 elected members and doesn’t actually influence the direction of the code; instead it leaves that to the SIGs and the architecture committee. There is no chair in this committee, and it hosts biweekly private as well as public meetings. Each SIG (repo) team has approver and reviewer roles, where reviewers review the code and approvers are responsible for merging it. Naser explained the OpenStack governance model.

  •  What are your 3 biggest issues that are barriers to growth/sustainability as a community? as a project?
    • Going up the contribution ladder is hard.
    • Stale/emeritus membership in reviewer/approver level groups.
    • Mono repo makes it so that each SIG involved needs to review which can slow things down.
  • Area/Domain mapping vs service/repo centered teams
    • mono repo forces people to work together and it’s a little more diverse who is reviewing
    • mono repo is bad because sometimes things need to be in multiple places like documentation
    • mono repo struggles with libraries especially on testing the external one.
  • Related Ecosystem projects comparisons

The SIG lead has to sign off and accept that they are willing to take ownership and handle maintenance, releasing, etc. There is general consensus across k8s leadership that as much work as possible should be delegated out to subgroups and kept out of k/k. For the API interoperability challenge in a distributed model, k8s has conformance tests that exercise the API, and vendors try to upload conformance results every release or dot release.

  • Common FOSS topics

The release and CI parts were discussed, as well as how COVID is impacting community health. The k8s community has almost lost its independent or part-time contributors. Also, the k8s community is doing 3 releases per year compared to 4.

  • Staying connected going forward

To stay connected, it would be a good idea to extend an invite for k8s to join the PTG.

 

    Upstream Investment Opportunities:

We have three upstream investment opportunities defined for 2020, but there has been no help on any of these, nor on the previous (2018, 2019) opportunities. We started discussing whether to continue this for 2021 or to stop defining them and instead decide the areas to help when we have someone interested. Before deciding anything, mnaser will discuss this with the board of directors and get their opinion.

Action Items:

  •  mnaser to find ways to engage with members and talk with them.
  •  mnaser to talk to board about infrastructure
  •  mnaser to talk to board about platinum members contribution levels + what are they doing to drive engagement.

 

    Pop Up Team Check In:

Currently, we have two active popup teams: 1. policy, 2. encryption. The TC checked the progress of both teams. The policy team is very active and finished some work in the Victoria cycle (Cyborg finished it and Barbican started), and also hosted Forum and PTG sessions and discussed the Wallaby development plan. This team hosts a biweekly meeting to discuss and review progress.
The encryption team is also active. Josephine explained the progress on this; the Glance spec is merged.

Both teams will continue in Wallaby cycle also.

 

    Completing the Gerrit breach audit :

There are still 19 teams pending to finish this audit. Those are listed in the etherpad.

We encourage all the pending teams to finish the audit; TC members will start following up with those projects every week.

Action Items:

  • TC to follow up every week on progress

 

    Other topics:

We ran out of time, and all the pending (below) topics will be discussed in regular TC meetings. The TC will skip this month’s meeting but will hold weekly meetings from the 12th of November onwards.

  • Better way to test defined testing runtime
  • Monitoring in OpenStack: Ceilometer + Telemetry + Gnocchi state
  • Stable core team maintainers, and the process to recruit new members and add them to the stable core list.

 

by Ghanshyam Mann at November 01, 2020 05:55 PM

October 28, 2020

OpenStack Superuser

Virtual Open Infrastructure Summit Recap

Last week, thousands of community members participated in the Open Infrastructure Summit. This time, the commute for over 10,000 attendees was short, because just like every other 2020 conference, the Summit was virtual. Hosted by the Open Infrastructure Foundation (previously the OpenStack Foundation (OSF)—if you missed this announcement, check out the news), the Summit gathered people from over 120 countries and 30 different open source communities to collaborate around the next decade of open infrastructure. 

The hallway track and networking activities were missed, but the week was still full of announcements and new (and growing!) users sharing their open source, production use cases. The event was free to attend, and this was only possible with the support of the Summit sponsors: 

Headline: Canonical (ubuntu), Huawei, VEXXHOST
Premier: Cisco, Tencent Cloud
Exhibitor: InMotion Hosting, Mirantis, Red Hat, Trilio, VanillaStack, ZTE

Below is a snapshot of what you missed and what you may want to rewatch. 

Like I mentioned earlier, the OSF opened the Summit with some big news. Jonathan Bryce, executive director, announced the Open Infrastructure Foundation (OIF) as the successor to the OSF during the opening keynotes on Monday, October 19. With support from over 60 founding members and 105,000 community members, the OIF remains focused on building open source communities to build software that runs in production. Bryce was joined by Mark Collier, OIF COO, who announced that the OIF board of directors approved four new Platinum Members (a record number of new Platinum Members approved at one time): Ant Group, Facebook Connectivity, FiberHome, and Wind River. 

The OIF builds open source communities who write software that runs in production. The OpenStack and Kata Containers communities celebrated software releases, and dozens of users shared their open infrastructure production use cases. 

Five days before the Summit, the OpenStack community released its 22nd version, Victoria. In Tuesday’s keynote, Kendall Nelson, chair of the First Contact SIG, talked about some of the features that landed including Ironic features for a smaller standalone footprint and supporting more systems at the edge. There were also features around hardware enablement and supporting FPGAs that she says will continue through the Wallaby cycle, which the upstream developers are discussing at the Project Teams Gathering (PTG) this week. 

Right in time for the Summit, the Kata Containers community released its 2.0 version, including a rewrite of the Kata Containers agent to help reduce the attack surface and reduce memory overhead. The agent was rewritten in Rust, and users will see a 10x improvement in size, from 11MB to 300KB. Xu Wang, a member of the Kata Architecture Committee, joined the keynotes on Monday to talk about how Kata 2.0 is already running in production at Ant Group, home of the largest payment processor in the world as well as other financial services. At Ant Group, Kata Containers is running on thousands of nodes and over 10,000 CPU cores. 

Ant Group is one of many users who shared information around their production use cases. Below are some highlights of the users who spoke. You can now watch all of the breakout and keynote sessions, and there will also be some Forum sessions uploaded in the coming days. 

Production use cases: 

The OIF announced its newest open infrastructure pilot project, OpenInfra Labs, a collaboration among universities and vendors to integrate and optimize open source projects in production environments and publish complete, reproducible stacks for existing and emerging workloads. Michael Daitzman, a contributor to the project, delivered a keynote introducing the project, thanking the community for their work with projects like OpenStack, Kubernetes, and Ceph, and inviting new contributors to get involved.

Magma, an open source mobile packet core project initiated by Facebook Connectivity, was front and center at the Summit last week. In the opening keynotes, Amar Padmanabhan, engineer at Facebook, introduced Magma and shared the community’s mission to bridge the digital divide and connect the next billion people to the Internet. The project was further discussed in a production use case from Mariel Triggs, the CEO of MuralNet, who talked about the connectivity issues that indigenous nations face and how her organization leverages Magma for an affordable way to keep them connected. Boris Renski, founder and CEO of FreedomFi, returned to the Summit keynote stage to show that building an LTE network with Magma is so easy, even a goat could learn to do it. And sure enough, the goat successfully deployed the network. I’m pretty sure the looks on these faces sum it all up. 

Announced a few weeks ago, Verizon is running Wind River’s distribution of StarlingX in production for its 5G virtualized RAN. During Tuesday’s keynote, Ildiko Vancsa talked about their use case and why Verizon relies on StarlingX for ultra low latency, high availability, and zero-touch automated management. 

Over 15 million compute cores are managed by OpenStack around the world. Imtiaz Chowdhury, cloud architect at Workday, talked about how their deployment has contributed to that growth with their own 400,000 core OpenStack deployment. 

Additional OpenStack users talking about their production use cases include: 

Volvo Cars shared their Zuul production use case to kick off the second day of Summit keynotes. Johannes Foufas and Albin Vass talked about how premium cars need premium tools, and Zuul is a premium tool. The team uses Zuul to build several software components including autonomous driving software, and Foufas says speculative merge and the prioritized queue system are two Zuul features their team relies on.  

SK Telecom 5GX Labs won the 2020 Superuser Awards for their open infrastructure use case integrating multiple open source components in production, including Airship, Ceph, Kubernetes, and multiple components of OpenStack.  

This was the first year the Superuser Awards ceremony was only held once, and there were eight organizations who shared production open infrastructure use cases that were reviewed by the community and advisors to determine the winner. 

Learn how the 2020 Superuser Awards nominees are powering their organization’s infrastructure with open source in production: 

If you missed any of the above sessions or announcements, check out the Open Infrastructure Foundation YouTube channel. Then, join the global Open Infrastructure community, and share your own personal open source story using #WeAreOpenInfra on social media.

The post Virtual Open Infrastructure Summit Recap appeared first on Superuser.

by Allison Price at October 28, 2020 08:55 PM

CERN Tech Blog

10 Years of Openstack at CERN - Part 1

In this blog post we will go back 10 years to understand how OpenStack started at CERN. We will explore the motivation, the first installation, the prototypes, the deployment challenges and the initial production architecture. Let’s start! Before OpenStack 10 years ago CERN IT was working to bring virtualization into the CERN data centre. At that time most applications were still deployed directly in the physical nodes. With the start of the LHC (Large Hadron Collider) it was expected that the resource requests from the users and experiments would increase significantly.

by CERN (techblog-contact@cern.ch) at October 28, 2020 03:00 PM

October 26, 2020

VEXXHOST Inc.

OpenStack Victoria is Here! Let’s Get To Know Version 22

OpenStack Victoria, the latest version of the global open source project, was released on the 14th of October 2020. This is the 22nd iteration of OpenStack. At VEXXHOST, we couldn’t be more excited about this much-awaited release. We are also proud to inform you, our cloud operations and services are already running with the latest version.

The Release and the Basics

The release of Victoria coincided with the Open Infrastructure Summit 2020, held virtually this time. OpenStack received 20,059 code changes for Victoria, coming from 790 developers belonging to 160 organizations in 45 countries. A large global open source community backs OpenStack, and this level of contribution keeps OpenStack ranked among the top three most active open source projects worldwide.

The main theme behind OpenStack Victoria is its work on native integration with Kubernetes. The update also supports diverse architectures and provides enhanced networking capabilities.

OpenStack’s prominent strength is that it optimizes the performance of virtual machines and bare metal, and with the Victoria release, this is further boosted. In addition to several enhancements to OpenStack’s stable and reliable core and highly flexible integration options with other open source projects, the new release offers the following innovative features:

Highlight Features and Benefits of Victoria

  • Enhanced Native integration with Kubernetes

OpenStack Victoria provides greater native integration with Kubernetes through the different modules of the cloud platform. For instance, Ironic’s bare-metal deployment process has been split into several phases to better integrate with Kubernetes and standalone use. This marks an important trend, since bare metal via Ironic saw 66% more activity over the OpenStack Ussuri cycle. It also offers decomposition of the various provisioning steps and new possibilities, such as provisioning without BMC credentials and DHCP-less deployments.

Kuryr, a solution that bridges container framework networking models and the OpenStack networking abstraction, now supports custom resource definitions (CRDs). Kuryr will no longer use annotations to store data about OpenStack objects in the Kubernetes API; instead, corresponding CRDs (KuryrPort, KuryrLoadBalancer, and KuryrNetworkPolicy) are created.

Tacker, the OpenStack service for NFV orchestration, now supports additional Kubernetes objects and VNF LCM APIs. It also provides an additional method for reading Kubernetes object files and CNF artifact definitions in the CSAR package. Tacker also offers more extensive standard features for ETSI NFV-SOL (such as lifecycle management, scale-up, and VNF management) and a Fenix plug-in for rolling updates of VNFs using Fenix and Heat.

  • Additional Backing for Architectures and Standards

The Cyborg API now supports a PATCH call that allows direct programming of FPGAs with pre-uploaded bitstreams. The Victoria release also adds support for Intel QAT and Inspur FPGA accelerators.

Octavia now supports HTTP/2 over TLS based on Application Layer Protocol Negotiation (ALPN). It is now also possible to specify minimum TLS versions for listeners and pools.

Vitrage now supports loading data via the standard TMF639 Resource Inventory Management API.

  • Solutions to Complex Network Problems

Neutron now offers a metadata service that works over IPv6. This service can be used without a config drive in networks that are entirely IPv6-based. Neutron also now supports Distributed Virtual Routers (DVR) for flat networks, floating IP port forwarding for the OVN backend, and availability zones for OVN routers.

Kuryr now supports automatic detection of the VM bridging interface in nested configurations.

Octavia’s load balancer pools now support version two of the PROXY protocol. This makes it possible to pass client information to participating servers when using TCP protocols. This version provides improved performance when establishing new connections to participating servers using the PROXY protocol, especially while using IPv6.
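
As a hedged sketch of what using this looks like from the client side (names are placeholders, and it assumes a python-octaviaclient recent enough to accept PROXYV2):

# Create a pool that forwards client information to members via PROXY protocol v2
openstack loadbalancer pool create --name web-pool \
    --listener my-listener \
    --lb-algorithm ROUND_ROBIN \
    --protocol PROXYV2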

OpenStack Victoria and VEXXHOST

OpenStack Victoria is released at a time when OpenStack is officially a decade old. VEXXHOST is proud to have been a part of the OpenStack journey for nine of those ten years. We are always among the earliest to implement upgrades in our cloud systems, and it’s no different with Victoria. We’re very proud to offer the latest OpenStack upgrades for your private cloud and public cloud solutions. Contact us with all your cloud-related queries; we’re all ears!

Would you like to know more about OpenStack cloud? Download our white paper and get reading!

Conquer The Competition With OpenStack Cloud

The post OpenStack Victoria is Here! Let’s Get To Know Version 22 appeared first on VEXXHOST.

by Athul Domichen at October 26, 2020 05:49 PM

October 23, 2020

VEXXHOST Inc.

Open Infrastructure Summit 2020 – Recap of the First-Ever Virtual Summit

And that’s a wrap! Open Infrastructure Summit 2020 has come to an end, and it was quite an eventful few days, won’t you agree?

Owing to the pandemic situation, the event was held virtually this time, from 19th to 23rd of October. We definitely missed the face-to-face interaction but feel that the virtual summit was a different kind of vibrant experience altogether. At VEXXHOST, we had even more reason to be proud as we were a headline sponsor this time, and made quite a few exciting announcements during our keynote session.

Headline Sponsor - VEXXHOST

Image Credit: OpenStack

The collective energy we felt from participants from across the world through keynotes, workshops, and at our own virtual booth made for a summit like never before.

An Overview

First of all, we greatly appreciate the spirit of the open source community to really come together, organize, and make an event of this magnitude a grand success. Considering the challenging nature of things due to the pandemic, the effort deserves to be lauded.

Open source developers, IT decision-makers, and operators representing as many as 750 companies spanning 110 countries attended the four-day event.

Members of open source communities such as Ansible, Ceph, Kubernetes, Airship, Docker, ONAP, Kata Containers, OpenStack, Open vSwitch, Zuul, StarlingX, OPNFV, and many more were eager participants of the summit from start to finish.

There were numerous keynotes, forums, sessions, presentations, and workshops on relevant topics such as Container Infrastructure, 5G, NFV & Edge, Public, Private & Hybrid Clouds, CI/CD, AI, Machine Learning, HPC, and Security.

The Open Infrastructure Summit also saw a huge announcement from the foundation.

It’s the Open Infrastructure Foundation!

During the Summit, the OpenStack Foundation announced its evolution into the ‘Open Infrastructure Foundation’. This move came as a surprise for many but was welcomed with much cheer and fervor from attendees. The renaming is part of the foundation’s multi-year community evolution initiative, which promises to improve the way open source projects work. VEXXHOST congratulates the foundation on this occasion. We are also proud to be a founding Silver Member of the Open Infrastructure Foundation in this new beginning.

VEXXHOST – Silver Member – Open Infrastructure Foundation

The OpenStack Foundation was founded in 2012 to govern the OpenStack project and several other open source projects that evolved from it. Over the years, the Foundation has developed into an entity with much more under its wings. Moreover, modern use case demands placed on infrastructure, such as containers, 5G, machine learning, AI, NFV, and edge computing, were also responsible for this shift.

Even with its evolution into the Open Infrastructure Foundation, the initiative will still have the OpenStack project at its heart. The only difference is that the development and adoption of other projects will get greater scope and attention as well.

The foundation also announced that even more innovations are planned and will be announced to the community shortly. We can’t wait to see what’s in store.

Speaking of announcements, we had a few important ones during the summit as well.

Team VEXXHOST Takes a Leap Forward

This year, Team VEXXHOST was proud to be a headline sponsor of the summit. We had a virtual booth of our own and interacted with members from various open source communities. We also gave away virtual bags with many exciting offers and credits to people who visited us at our booth.

Mohammed Naser – Keynote – Open Infrastructure Summit 2020

Our CEO, Mohammed Naser, delivered a keynote presentation and a talk on Tuesday, October 20th. During the keynote, he announced a revamp of our public cloud offerings and here are the relevant details for you:

  • Our Montreal region gets new servers equipped with 2nd Gen AMD EPYC™ processors
  • Storage at our Montreal region upgraded from SSD to NVMe
  • New aggressive pricing for public cloud offerings
  • A new region in Amsterdam!

Find all the juicy details about our revamp here.

To share our happiness on this occasion, we’re offering free credit for users to experience our OpenStack powered cloud. This free trial will provide you with a straightforward user interface and grant you access to all the cool tools you need in a web-based console.

Hind Naser’s breakout session on “The Big Decision When Adopting OpenStack as a Private Cloud”.

On Day 2 of the summit, Hind Naser, our Director of Business Development, presented a breakout session talk on “The Big Decision When Adopting OpenStack as a Private Cloud”. Through the session, Hind provided informative insights to the attendees on the various decisions, limitations, and pitfalls when a user is starting the private cloud journey.

See You at the Next Summit!

We had a great time at Open Infrastructure Summit 2020 with all the new announcements, keynotes, sessions, workshops etc. Thank you one and all, for attending the summit and visiting us at our virtual booth. If you would like to know more about our public cloud, private cloud or other solutions, do contact us!

The post Open Infrastructure Summit 2020 – Recap of the First-Ever Virtual Summit appeared first on VEXXHOST.

by Athul Domichen at October 23, 2020 07:48 PM

Galera Cluster by Codership

Webinar recording: The New Galera Manager Deploys Galera Cluster for MySQL on Amazon Web Services

We have a video recording available for you to learn how you can benefit from the new Galera Manager. It includes a live demo of how to install Galera Manager and easily deploy Galera Cluster on Amazon Web Services for geo-distributed multi-master MySQL, disaster recovery, and fast local reads and writes. Now you can monitor and manage your Galera Cluster with a graphical interface.

“The presentation was great with lots of valuable information. We will definitely try to implement Galera Manager in our environment very soon,” stated an attendee of the webinar.

Watch Galera Manager webinar recording

Download Galera Manager

by Sakari Keskitalo at October 23, 2020 11:41 AM

October 22, 2020

Corey Bryant

OpenStack Victoria for Ubuntu 20.10 and Ubuntu 20.04 LTS

The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack Victoria on Ubuntu 20.10 (Groovy Gorilla) and Ubuntu 20.04 LTS (Focal Fossa) via the Ubuntu Cloud Archive. Details of the Victoria release can be found at:  https://www.openstack.org/software/victoria.

To get access to the Ubuntu Victoria packages:

Ubuntu 20.10

OpenStack Victoria is available by default for installation on Ubuntu 20.10.

Ubuntu 20.04 LTS

The Ubuntu Cloud Archive for OpenStack Victoria can be enabled on Ubuntu 20.04 by running the following command:

sudo add-apt-repository cloud-archive:victoria

The Ubuntu Cloud Archive for Victoria includes updates for:

aodh, barbican, ceilometer, cinder, designate, designate-dashboard, glance, gnocchi, heat, heat-dashboard, horizon, ironic, keystone, magnum, manila, manila-ui, masakari, mistral, murano, murano-dashboard, networking-arista, networking-bagpipe, networking-baremetal, networking-bgpvpn, networking-hyperv, networking-l2gw, networking-mlnx, networking-odl, networking-sfc, neutron, neutron-dynamic-routing, neutron-vpnaas, nova, octavia, octavia-dashboard, openstack-trove, trove-dashboard, ovn-octavia-provider, panko, placement, sahara, sahara-dashboard, sahara-plugin-spark, sahara-plugin-vanilla, senlin, swift, vmware-nsx, watcher, watcher-dashboard, and zaqar.

For a full list of packages and versions, please refer to:

http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/victoria_versions.html
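
As a quick, hedged sanity check after enabling the cloud archive (nova-common is just an illustrative package name):

# Refresh the package index and see which Victoria version would be installed
sudo apt update
apt-cache policy nova-common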

Reporting bugs

If you have any issues please report bugs using the ‘ubuntu-bug’ tool to ensure that bugs get logged in the right place in Launchpad:

sudo ubuntu-bug nova-conductor

Thank you to everyone who contributed to OpenStack Victoria. Enjoy and see you in Wallaby!

Corey

(on behalf of the Ubuntu OpenStack Engineering team)

by coreycb at October 22, 2020 08:11 PM

Adam Young

Adding Nodes to Ironic

TheJulia was kind enough to update the docs for Ironic to show me how to include IPMI information when creating nodes.

To delete all the old nodes:

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; do openstack baremetal node delete $UUID; done

Nodes definition

I removed the ipmi common data from each definition as there is a password there, and I will set that afterwards on all nodes.

{
  "nodes": [
    {
      "ports": [
        {
          "address": "00:21:9b:93:d0:90"
        }
      ],
      "name": "zygarde",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.10"
      }
    },
    {
      "ports": [
        {
          "address": "00:21:9b:9b:c4:21"
        }
      ],
      "name": "umbreon",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.11"
      }
    },
    {
      "ports": [
        {
          "address": "00:21:9b:98:a3:1f"
        }
      ],
      "name": "zubat",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.12"
      }
    }
  ]
}

Create the nodes

openstack baremetal create  ./nodes.ipmi.json 

Check that the nodes are present

$ openstack baremetal node list
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name    | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+
| 3fa4feae-0d5c-4e38-a012-29258d40651b | zygarde | None          | None        | enroll             | False       |
| 00965ad4-c972-46fa-948a-3ce87aecf5ac | umbreon | None          | None        | enroll             | False       |
| 8702ea0c-aa10-4542-9292-3b464fe72036 | zubat   | None          | None        | enroll             | False       |
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+

Update IPMI common data

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; 
do  openstack baremetal node set $UUID --driver-info ipmi_password=`cat ~/ipmi.password`  --driver-info   ipmi_username=admin   ; 
done

EDIT: I had ipmi_user before and it does not work. Needs to be ipmi_username.

Final Check

And if I look in the returned data for the definition, we see the password is not readable:

$ openstack baremetal node show zubat  -f yaml | grep ipmi_password
  ipmi_password: '******'

Power On

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; do  openstack baremetal node power on $UUID  ; done

Change “on” to “off” to power off.

by Adam Young at October 22, 2020 03:14 AM

October 21, 2020

Galera Cluster by Codership

Galera Cluster for MySQL 5.6.49, 5.7.31, and 8.0.21 released

Codership is pleased to announce a new Generally Available (GA) release of the multi-master Galera Cluster for MySQL 5.6, 5.7 and 8.0, consisting of MySQL-wsrep 5.6.49 (release notes, download), 5.7.31 (release notes, download), and 8.0.21 (release notes, download) with Galera Replication library 3.31 (release notes, download) implementing wsrep API version 25 for 5.6 and 5.7, and Galera Replication library 4.6 (release notes, download) implementing wsrep API version 26 for 8.0. This release incorporates all changes to MySQL 5.6.49, 5.7.31 , and 8.0.21 respectively, adding a synchronous option for your MySQL High Availability solutions.

It is recommended that one upgrades their Galera Cluster for MySQL 5.6, 5.7 and 8.0, because this release includes a fix for the security vulnerability CVE-2020-15180. The binary tarball is also compiled with OpenSSL 1.1.1g.

A highlight of this release is that with MySQL 8.0.21, you now have access to the Percona audit log plugin, which helps with monitoring and logging connection and query activity performed on specific servers. This implementation is provided as an alternative to the MySQL Enterprise Audit Log Plugin.

In addition to fixing deadlocks that may occur between DDL and applying transactions, in 8.0.21 the write-set replication patch is now optimised to work with the Contention-Aware Transaction Scheduling (CATS) algorithm that is present in InnoDB. You can read more about transaction scheduling in the MySQL manual.

For those that requested the missing binary tarball package, the MySQL 8.0.21 build includes just that. Packages continue to be available for: CentOS 7 & 8, Red Hat Enterprise Linux 7 & 8, Debian 10, SLES 15 SP1, as well as Ubuntu 18.04 LTS and Ubuntu 20.04 LTS. The latest versions are also available in the FreeBSD Ports Collection.

The Galera Replication library has had some notable fixes, one of which improves memory usage tremendously. The in-memory GCache index implementation now uses sorted std::deque instead of std::map, and this leads to an eightfold reduction in memory footprint. Hardware CRC32 is now supported on x86_64 and ARM64 platforms.

There are also three new status variables: wsrep_flow_control_active (to tell you whether flow control is currently active (replication paused) in the cluster), wsrep_flow_control_requested (to tell you whether the node has requested a replication pause because the received events queue is too long), and wsrep_gmcast_segment (to tell you which cluster segment the node belongs to).
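
A hedged example of checking the new variables on a node running these releases (connection options omitted):

# Flow-control state and the segment this node belongs to
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_%';"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_gmcast_segment';"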

For Galera Replication library 3.31, this is the last release for Debian Jessie and openSUSE 15.0. For Galera Replication library 4.6, this is the last release for openSUSE 15.0. For MySQL-wsrep 5.6 and 5.7, this is also the last release for Debian Jessie. For MySQL-wsrep 5.7 and MySQL-wsrep 8.0, this is the last release for openSUSE 15.0.

by Sakari Keskitalo at October 21, 2020 11:58 AM

October 20, 2020

OpenStack Superuser

SK Telecom 5GX Cloud Labs wins the 2020 Superuser Awards

SK Telecom Cloud 5GX Labs is the 12th organization to win the Superuser Awards. The news was announced today during the virtual 2020 Open Infrastructure Summit. You can watch the announcement on demand in the Summit platform.

Elected by members of the community, the team that wins the Superuser Awards is lauded for the unique nature of its use case as well as its integration and application of open infrastructure. SK Telecom 5GX Cloud Labs was among eight nominees for the Award this year and is the first to receive the Award for an Airship use case.

In addition to contributing upstream to OpenStack and Airship, an open source project supported by the Open Infrastructure Foundation, SK Telecom developed a containerized OpenStack on Kubernetes solution called SKT All Container Orchestrator (TACO), based on OpenStack-helm and Airship. TACO is a containerized, declarative, cloud infrastructure lifecycle manager that gives operators the capability to remotely deploy and manage the entire lifecycle of cloud infrastructure and add-on tools and services by treating all infrastructure like cloud native apps. They deployed it to SKT's core systems, including the telco mobile network and IPTV service, which currently has 5.5 million subscriptions, as well as for external customers (next generation broadcasting system, VDI, etc.). Additionally, the team is strongly engaged in community activity in Korea, sharing all of their technologies and experiences with regional communities (OpenStack, Ceph, Kubernetes, etc.).

Just before the big announcement, Jeff Collins and Matt McEuen discussed the upcoming Airship 2.0 release, which is now in beta. Rewatch the announcement now!

 

SK Telecom 5GX Labs celebrates winning the Superuser Awards

 

The post SK Telecom 5GX Cloud Labs wins the 2020 Superuser Awards appeared first on Superuser.

by Ashlee Ferguson at October 20, 2020 04:18 PM

OpenStack Blog

10 Years of OpenStack – Ghanshyam Mann at NEC

Happy 10 years of OpenStack! Millions of cores, 100,000 community members, 10 years of you. Storytelling is one of the most powerful means to influence, teach, and inspire the people around us. To celebrate OpenStack’s 10th anniversary, we are spotlighting stories from the individuals in various roles from the community who have helped to make... Read more »

by Sunny at October 20, 2020 03:00 PM

StackHPC Team Blog

StackHPC is OpenInfra!

Open infrastructure will underpin the next decade of transformation for cloud infrastructure. With the virtual Open Infrastructure Summit well underway, the first major announcement has been the formation of a new foundation, the Open Infrastructure Foundation. StackHPC is proud to be a founding member.

Open Infrastructure Foundation

StackHPC's CEO, John Taylor, comments: "We are extremely pleased to be a part of the new decade of Open Infrastructure and welcome the opportunity to continue to transfer the values of 'Open' to our clients."

StackHPC's CTO, Stig Telfer, recorded a short video describing how the concept of open infrastructure is essential to our work, and how as a company we contribute to open infrastructure as a central part of what we do:

Get in touch

If you would like to get in touch we would love to hear from you. Reach out to us via Twitter or directly via our contact page.

by Stig Telfer at October 20, 2020 11:00 AM

October 19, 2020

VEXXHOST Inc.

What Does OpenStack Foundation’s Change of the Decade Mean to the OpenStack Community?


You’ve probably heard the news – the OpenStack Foundation has announced its evolution into the Open Infrastructure Foundation. If you haven’t heard yet, well, here you go. The word in tech circles is that it is the biggest change for the Foundation since its inception in 2012, when it was set up to support the OpenStack project and community.

At VEXXHOST, we’re more than excited about the evolution of OSF into OIF as a lot of things are changing for us as well. Here are all the details you need:

A Little Background

The OpenStack Foundation was established in 2012 to further advance the OpenStack project while also developing new projects under a much larger umbrella. During this process, and even now, the Foundation and the project are often confused with one another. For instance, when someone mentions the Foundation, others think of the project, and so on. We’ve all been there.

In reality, the Foundation has grown to include much more under its wing, and there was a need to make a distinction to enable further development. Moreover, the demands placed on infrastructure by modern use cases such as containers, AI, NFV, machine learning, edge computing and 5G also helped catalyze this shift.

In this context, the OpenStack Foundation has been implementing a multi-year community evolution initiative, and the renaming is a pivotal moment in that effort.

What Changes

Even with the evolution into Open Infrastructure Foundation, the heart of the initiative will still be the OpenStack project. It’s just that the adoption and development of other projects will receive a larger scope and support than before.

The major open source projects that will receive this support are:

  • Airship
  • Kata Containers
  • Zuul
  • OpenInfra Labs
  • StarlingX

There are more innovations planned and as the time comes, they will be revealed to the community.

What Changes for VEXXHOST

VEXXHOST now has a decade-long association with the OpenStack Foundation, offering public and private cloud solutions in a dependable, consistent manner. Being an Infrastructure Donor and a Corporate Member, we’ve also been associated with the Foundation as a steady contributor and user of OpenStack, other OSF projects, and Zuul. At the Open Infrastructure Summit 2019, VEXXHOST received the Super User Award as well.

The points above reveal our steady partnership with the Foundation. We join the Foundation’s new transformation with great pride and view it as a great opportunity for mutual growth. Not only will this help us gain exposure and expand our work within the OpenStack community, but it will also allow us to strengthen our position as a forerunner in the open source industry.

Founding Silver Member

As mentioned before, VEXXHOST has been a Corporate Member of the OpenStack Foundation over the years. With the entity’s evolution into the Open Infrastructure Foundation, we are proud to join it as a Founding Silver Member. This means that we will continue to extend our support in contributing to and developing the existing as well as upcoming projects.

Paraphrasing the quote from the classic movie Casablanca (1942), “We think this is the continuation of a beautiful friendship!”

VEXXHOST has a range of cloud offerings powered by OpenStack and other projects governed by the foundation. Schedule a call with us and get access to a free OpenStack environment. If you have any other queries regarding our services, contact us and we’ll get back to you.

Would you like to know more about OpenStack Cloud? Download our white paper and get reading!


Conquer The Competition With OpenStack Cloud

The post What Does OpenStack Foundation’s Change of the Decade Mean to the OpenStack Community? appeared first on VEXXHOST.

by Athul Domichen at October 19, 2020 05:13 PM

Galera Cluster by Codership

Effective Monitoring of your Galera Cluster for MySQL with Galera Manager

While we have documented how you might consider Monitoring a Cluster with Galera Manager, we’d also like to take you through a bit more of what is available before our webinar this week. Please sign up for a live demo covering everything from installation to deployment and management.

You might be used to SHOW GLOBAL STATUS LIKE 'wsrep_%'; from the command line, but why not take a look at all of this on a graph, over time, within the Galera Manager GUI? What happens when you feel a node is getting overwhelmed? You tend to check wsrep_flow_control_paused, which returns the percentage of time the node was paused because of Flow Control (normally you would do this after a FLUSH STATUS, but now you get it graphed over time). Sometimes you want to monitor key metrics like local state, cluster status and size, flow control pauses, and the local receive/send queue averages. Overall, from queue sizes, to flow control, to the number of transactions for a node (and in bytes), as well as replication conflicts, Galera Manager covers all of this with over 600 metrics that you can watch.
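For reference, the command-line equivalent of the flow control check that Galera Manager now graphs for you is just two statements; a minimal sketch, assuming you run it on the node you suspect is struggling:

-- reset the counters, let the node run for a while, then read the pause ratio
FLUSH STATUS;
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';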

Once you have deployed a cluster in Galera Manager, you will be able to get a general overview from the Monitor. By default, you get to see CPU information, wsrep_received (total number of write-sets received from other nodes), wsrep_replicated (total number of write-sets sent to other nodes, via replication), wsrep_flow_control_paused (time since last FLUSH STATUS that replication was paused due to flow control), and wsrep_flow_control_sent (number of FC_PAUSE events the node has sent).

Connect to one of the nodes (the Configuration tab will provide your database address and the root password that you can use to connect), and start doing some data insertion. You will then notice that the CPU time, wsrep_received, and even wsrep_replicated will change. For completeness, we also added wsrep_local_commits to monitor, which is the total number of transactions committed.
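If you want something concrete to generate that activity, here is a tiny hedged example; the gm_demo schema and table are made-up names purely for illustration:

-- create a throwaway schema and table, then write a few rows so that
-- wsrep_received, wsrep_replicated and wsrep_local_commits start moving
CREATE DATABASE IF NOT EXISTS gm_demo;
CREATE TABLE IF NOT EXISTS gm_demo.t (id INT AUTO_INCREMENT PRIMARY KEY, payload VARCHAR(64));
INSERT INTO gm_demo.t (payload) VALUES ('a'), ('b'), ('c');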

Since there are hundreds of metrics to choose from (currently 620), why not focus on adding some of your own?

The useful thing about the cluster view is that you can also dive deeper into seeing what is happening on a per-node basis. To some extent this can also ensure that all your nodes are being utilised in an efficient manner and that there aren’t any preferred nodes from the application or the proxy.

In upcoming releases, we will focus on how you can also monitor the effectiveness of streaming replication, so that you can visually see how the long running transactions are performing and if you need to change fragment sizes appropriately. Let us know what else you would like to see from Galera Manager, by sending us feature requests at: info@codership.com.

For further reading, please visit Monitoring a Galera Cluster, Using Status Variables to monitor a Galera Cluster, and Database Server Logs.

by Sakari Keskitalo at October 19, 2020 06:40 AM

Stephen Finucane

Comparing Nova Database Migrations

One of the goals for the Wallaby release of OpenStack Nova is to compact many of the database migrations that have been slowly building up since the Icehouse release some 6½ years ago.

October 19, 2020 12:00 AM

October 16, 2020

OpenStack Superuser

#OpenInfraSummit Track: Security

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, private & hybrid cloud and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—Security. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

Dependency Analytics

  • Presented by Shubham Mathur & Darshan Vandra from Red Hat
  • What can you expect to learn?
    • In this session, we want to show developers how they can enhance their programming capabilities by using this extension. It is also a great learning opportunity for students or those who are just out of college, as they get to learn about aspects of selecting a package for their project that they are usually not aware of.
  • Add this session to your Summit calendar!

Digital Sovereignty – Why Open Infrastructure Matters

  • Presented by Marius Feldmann from Cloud & Heat Technologies, Kurt Garloff
  • What can you expect to learn?
    • In 2019 the term “digital sovereignty” gained momentum in Europe. For the first time in history, several ministries, such as the German and French ministries for Economic Affairs, pushed forward initiatives to establish digital sovereignty. In this talk, we will give an interpretation of this important term. Additionally, we will provide an overview of the European initiative GAIA-X. Last but not least, we will point out why, in our opinion, neither digital sovereignty nor the goals of the GAIA-X initiative can be achieved without open infrastructures.

Improvements to Identity Federation in Keystone

  • Presented by Kristi Nikolla from Mass Open Cloud
  • What can you expect to learn?
    • Previous and current state of identity federation in keystone
    • Expiring group memberships and how to carry permissions through mapping rules
    • Operations on the federated attributes of a user through the API
    • What identity federation is, why you would use it, and how to set it up
  • Add this session to your Summit calendar!

Introducing Open Enclave

  • Presented by Aeva Black from Microsoft
  • What can you expect to learn?
    • In this talk, attendees will learn about the work happening in the Confidential Computing Consortium to secure public cloud workloads using hardware trusted execution enclaves (HW TEE’s), a new CPU capability introduced recently by Intel, AMD, and ARM.
    • More specifically, attendees will learn about the Open Enclave SDK project which can be used to write “enclave aware” applications across multiple CPU architectures, and will be given either a tour of the code or a demo of a running application (depending on the luck of the day).
    • Attendees will leave this talk with enough knowledge to write their own Hello World application using Open Enclave.
  • Add this session to your Summit calendar!

Physical tenant separation: evaluation of concepts and proposed implementations

  • Presented by Laura Geisler from secustack GmbH.
  • What can you expect to learn?
    • Listeners will learn about possible approaches of integrating physical separation concepts into OpenStack and the corresponding advantages and disadvantages of each approach. Furthermore, attendees will get an understanding about how the seemingly contradicting approach of introducing physical exclusivity can still fit into the cloud paradigm.
  • Add this session to your Summit calendar!

Security-by-construction: How to weave authorization into modern app stacks using Open Policy Agent

  • Presented by Tim Hinrichs from Styra
  • What can you expect to learn?
    • Three Key Takeaways:
      • How to leverage the open source tools to build policy-as-code guardrails.
      • Best practices from the community for limiting risk.
      • How to shift security left, and bring policy into your development culture.
    • The audience will see working code snippets and live-coding.
  • Add this session to your Summit calendar!

Tackling security challenges with Airship

  • Presented by Alexander Hughes from Accenture
  • What can you expect to learn?
    • What Common Vulnerability Exposures (CVEs) are
    • How to detect CVEs in the Operating System (such as apt packages)
    • How to detect CVEs in programming language dependencies (such as PyPi)
    • How to scan Docker images for these CVEs, as well as viruses
    • Common ways to resolve CVEs
    • The impact a base image has on your image’s security
  • Add this session to your Summit calendar!

Towards Enclave-as-a-Container with Inclavare containers and Occlum

  • Presented by Yuntong Jin from Intel, Hongliang Tian from Ant Group and Tianjia Zhang
  • What can you expect to learn?
    • This topic will introduce what confidential computing is, the architecture of Inclavare Containers, and how it can be deployed in the cloud.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: Security appeared first on Superuser.

by Sunny Cai at October 16, 2020 01:00 PM

October 15, 2020

Adam Young

Introduction to Ironic

“I can do any thing. I can’t do everything.”

The sheer number of projects and problem domains covered by OpenStack was overwhelming. I never learned several of the other projects under the big tent. One project that is getting relevant to my day job is Ironic, the bare metal provisioning service. Here are my notes from spelunking the code.

The Setting

I want just Ironic. I don’t want Keystone (personal grudge) or Glance or Neutron or Nova.

Ironic will write files to e.g. /var/lib/tftp and /var/www/html/pxe and will not handle DHCP, but it can make use of static DHCP configurations.

Ironic is just an API server at this point (a Python-based web service) that manages the above files, and that can also talk to the IPMI ports on my servers to wake them up and perform configurations on them.

I need to provide ISO images to Ironic so it can put them in the right place to boot them.

Developer steps

I checked the code out of git. I am working off the master branch.

I ran tox to ensure the unit tests are all at 100%

I have mysql already installed and running, but with a Keystone Database. I need to make a new one for ironic. The database name, user, and password are all going to be ironic, to keep things simple.

CREATE USER 'ironic'@'localhost' IDENTIFIED BY 'ironic';
create database ironic;
GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'localhost';
FLUSH PRIVILEGES;

Note that I did this as the Keystone user. That dude has way too much privilege… good thing this is JUST for DEVELOPMENT. This will be used to follow the steps in the developer quickstart docs. I also set the mysql URL in the config file to this:

connection = mysql+pymysql://ironic:ironic@localhost/ironic

Then I can run the ironic db sync. Let’s see what I got:

mysql ironic --user ironic --password
#....
MariaDB [ironic]> show tables;
+-------------------------------+
| Tables_in_ironic              |
+-------------------------------+
| alembic_version               |
| allocations                   |
| bios_settings                 |
| chassis                       |
| conductor_hardware_interfaces |
| conductors                    |
| deploy_template_steps         |
| deploy_templates              |
| node_tags                     |
| node_traits                   |
| nodes                         |
| portgroups                    |
| ports                         |
| volume_connectors             |
| volume_targets                |
+-------------------------------+
15 rows in set (0.000 sec)

OK, so the first table shows that Ironic uses Alembic to manage migrations. Unlike the SQLAlchemy migrations table, you can’t just query this table to see how many migrations have been performed:

MariaDB [ironic]> select * from alembic_version;
+--------------+
| version_num  |
+--------------+
| cf1a80fdb352 |
+--------------+
1 row in set (0.000 sec)

Running The Services

The script to start the API server is:
ironic-api -d --config-file etc/ironic/ironic.conf.local

Looking in the file requirements.txt, I see that the Web framework for Ironic is Pecan:

$ grep pecan requirements.txt 
pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 # BSD

This is new to me. On Keystone, we converted from no framework to Flask. I’m guessing that if I look in the chain that starts with the ironic-api file, I will see a Pecan launcher for a web application. We can find that file with:

$ which ironic-api
/opt/stack/ironic/.tox/py3/bin/ironic-api

Looking in that file, it references ironic.cmd.api, which is the file ironic/cmd/api.py which in turn refers to ironic/common/wsgi_service.py. This in turn refers to ironic/api/app.py from which we can finally see that it imports pecan.

Now I am ready to run the two services. Like most of OpenStack, there is an API server and a “worker” server. In Ironic, this is called the Conductor. This maps fairly well to the Operator pattern in Kubernetes. In this pattern, the user makes changes to the API server via a web VERB on a URL, possibly with a body. These changes represent a desired state. The state change is then performed asynchronously. In OpenStack, the asynchronous communication is performed via a message queue, usually Rabbit MQ. The Ironic team has a simpler mechanism used for development; JSON RPC. This happens to be the same mechanism used in FreeIPA.
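For what it’s worth, the relevant bits of my etc/ironic/ironic.conf.local ended up looking roughly like the sketch below. Treat the option names as assumptions to double-check against the Ironic documentation rather than a definitive recipe; the database connection line is the one shown earlier.

[DEFAULT]
# no Keystone in this setup
auth_strategy = noauth
# talk JSON RPC between the API server and the conductor instead of a message queue
rpc_transport = json-rpc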

Command Line

OK, once I got the service running, I had to do a little fiddling around to get the command lines to work. There was an old reference to

OS_AUTH_TYPE=token_endpoint

which needed to be replaced with

OS_AUTH_TYPE=none

Both are in the documentation, but only the second one will work.
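My client environment ended up being just two variables; the endpoint one is an assumption based on the standalone client documentation, pointing at the API server started above:

export OS_AUTH_TYPE=none
export OS_ENDPOINT=http://127.0.0.1:6385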

I can run the following commands:

$ baremetal driver list
+---------------------+----------------+
| Supported driver(s) | Active host(s) |
+---------------------+----------------+
| fake-hardware       | ayoungP40      |
+---------------------+----------------+
$ baremetal node list


curl

Let’s see if I can figure out from curl what APIs those are… There is only one version, and one link, so:

curl http://127.0.0.1:6385 | jq '.versions  | .[] | .links | .[] |  .href'

"http://127.0.0.1:6385/v1/"


Doing curl against that second link gives a list of the top level resources:

  • media_types
  • chassis
  • nodes
  • drivers
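For the record, a hedged sketch of the jq call that produces that list from the v1 root; the output will likely also include bookkeeping fields such as id and links:

curl -s "http://127.0.0.1:6385/v1/" | jq 'keys'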

And I assume that, if I use curl to GET the drivers, I should see the fake driver entry from above:

$ curl "http://127.0.0.1:6385/v1/drivers" | jq '.drivers |.[] |.name'

"fake-hardware"

OK, that is enough to get started. I am going to try and do the same with the RPMs that we ship with OSP and see what I get there.

But that is a tale for another day.

Thank You

I had a conversation with Julia Kreger, a long-time core member of the Ironic project, which helped get me oriented.

by Adam Young at October 15, 2020 07:27 PM

October 14, 2020

Thomas Goirand

The Gnocchi package in Debian

This is a follow-up to Russell’s blog post as seen here: https://etbe.coker.com.au/2020/10/13/first-try-gnocchi-statsd/. There are a bunch of things he wrote which I unfortunately must say are inaccurate, and sometimes even completely wrong. It is my point of view that none of the reported bugs are helpful for anyone who understands Gnocchi and how to set it up. It was, however, a terrible experience for Russell, and I do understand why (and why it’s not his fault). I’m very much open to suggestions on how to fix this at the packaging level, though some things aren’t IMO fixable. Here are the details.

1/ The daemon startups

First of all, the most surprising thing is when Russell claimed that there’s no startup scripts for the Gnocchi daemons. In fact, they all come with both systemd and sysv-rc support:

# ls /lib/systemd/system/gnocchi-api.service
/lib/systemd/system/gnocchi-api.service
# /etc/init.d/gnocchi-api
/etc/init.d/gnocchi-api

Russell then tried to start gnocchi-api without the right options that are set in the Debian scripts, and not surprisingly, this failed. Russell attempted to do what was in the upstream doc, which isn’t adapted to what we have in Debian (the upstream doc is probably completely outdated, as Gnocchi is unfortunately not very well maintained upstream).

The bug #972087 is therefore, IMO not valid.

2/ The database setup

By default, for all things OpenStack in Debian, there are some debconf helpers using dbconfig-common to help users set up databases for their services. This is clearly aimed at beginners, but that doesn’t prevent you from trying to understand what you’re doing. More specifically for Gnocchi, there are two databases: one for Gnocchi itself, and one for the indexer, which doesn’t necessarily use the same backend. The Debian package already sets up one database, but one has to do it manually for the indexer. I’m sorry this isn’t well enough documented.
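To make that concrete, here is a minimal sketch of the manual indexer setup, assuming a MariaDB backend; the database name, user and password are purely illustrative, and the [indexer]/url option name should be double-checked against the Gnocchi documentation for your version:

# create a dedicated database for the indexer
mysql -e "CREATE DATABASE gnocchi_indexer;"
mysql -e "GRANT ALL PRIVILEGES ON gnocchi_indexer.* TO 'gnocchi'@'localhost' IDENTIFIED BY 'secret';"

# then point the indexer at it in gnocchi.conf
[indexer]
url = mysql+pymysql://gnocchi:secret@localhost/gnocchi_indexer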

Now, while some packages support SQLite as a backend (since most things in OpenStack use SQLAlchemy), it looks like Gnocchi doesn’t right now. This is IMO a bug upstream, rather than a bug in the package. However, I don’t think the Debian packages are to blame here, as they simply offer a unified interface, and it’s up to the users to know what they are doing. SQLite is in any case not a production-ready backend. I’m not sure if I should close #971996 without any action, or just try to disable the SQLite backend option of this package because it may be confusing.

3/ The metrics UUID

Russell then thinks the UUID should be set by default. This is probably right in a single-server setup; however, it wouldn’t work when setting up a cluster, which is probably what most Gnocchi users will do. In this type of environment, the metrics UUID must be the same on the 3 servers, and setting a random (and therefore different) UUID on each of the 3 servers wouldn’t work. So I’m also tempted to just close #972092 without any action on my side.

4/ The coordination URL

Since Gnocchi, like the rest of OpenStack, is supposed to be set up with more than one server (an HA setup is very common), a backend for coordination (i.e. sharing the workload) must be set. This is done by setting a URL that tooz understands. The best coordinator being Zookeeper, something like this should be set by hand:

coordination_url=zookeeper://192.168.101.2:2181/

Here again, I don’t think the Debian package is to blame for not providing the automation. I would however accept contributions to fix this and provide the choice using debconf; users would still need to understand what’s going on and set up something like Zookeeper (or Redis, memcached, or any other backend supported by tooz) to act as the coordinator.
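For completeness, the equivalent line with Redis as the coordinator follows the same tooz URL scheme (the address is purely illustrative, reusing the example host above):

coordination_url=redis://192.168.101.2:6379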

5/ The Debconf interface cannot replace a good documentation

… and there’s not so much I can do at the package-maintainer level for this.

Russell, I’m really sorry for the bad user experience you had with Gnocchi. Now that you know a little bit more about it, maybe you can have another go? Sure, the OpenStack telemetry system isn’t an easy-to-understand beast, but it’s IMO worth trying. And the recent versions can scale horizontally…

by Goirand Thomas at October 14, 2020 01:07 PM

OpenStack Superuser

#OpenInfraSummit Track: Private & Hybrid Cloud

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, private & hybrid cloud and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—Private & Hybrid Cloud. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

10 years of OpenStack at CERN (From 0 to 300k cores)

  • Presented by Belmiro Moreira from CERN
  • What can you expect to learn?
    • In this presentation, we will describe the CERN Cloud Infrastructure from its early prototypes to today.
    • We will dive into the history, architecture, tools and technical decisions behind the CERN Cloud Infrastructure over the years.
  • Add this session to your Summit calendar!

Achieve a single OpenStack deployment in multiple datacenter

  • Presented by Victor Coutellier from Societe Generale
  • What can you expect to learn?
    • Learn why we chose to have only one OpenStack deployment spanning all the datacenters of a cloud region, which issues we encountered, and how we implemented solutions, either by using built-in features of the product or by developing new ones for the community.
  • Add this session to your Summit calendar!

Auto-scaling in OpenStack without Telemetry

  • Presented by Dat Vu Tuan & Kien Nguyen-Tuan from Viettel Network
  • What can you expect to learn?
    • Fundamental knowledge of what auto-scaling is and how auto-scaling works
    • Why OpenStack Telemetry is not production-ready for auto-scaling at the moment
    • How the auto-scaling architecture consisting of Prometheus, OpenStack and Faythe, an open source tool we built ourselves, is doing an excellent job, saving us tons of human labor
    • The superior points of the new stack and how we benefited from it

Back to the original mission: an open cloud operating system

  • Presented by Christian Berendt from Betacloud Solutions GmbH and Kurt Garloff
  • What can you expect to learn?
    • Clear separation between a “cloud operating system” and “cloud software”
    • Overview of the status of existing solutions with focus on OpenStack
    • Deep understanding of the problems (technical and non technical) of existing solutions, e.g.:
      • missing or not matured toolchains
      • missing standardization
      • missing best practices
      • missing good defaults for the production
      • missing free knowledge
      • missing transparency
      • missing and inferior security
      • insufficient testing and validation
      • too few working updates
      • lack of durability and long-term sustainability
    • Overview of possible solutions (technical and non-technical) to the problems identified
  • Add this session to your Summit calendar!

Building the European Weather Cloud with Open source software : An end to end journey.

  • Presented by Vasileios Baousis from ECMWF
  • What can you expect to learn?
    • We will present the approach we followed to deploy the European Weather Cloud infrastructure, the decisions made, the challenges and opportunities, and the lessons learned from this journey, covering the OpenStack deployment (version, operating system, networking, configuration, upgrade approaches/challenges), the Ceph deployment (version, networking, configuration), OVS vs OVN, GPU implementation, and integration with our existing systems and services, including access to other internal systems like our ~250PB archive of meteorological data.
  • Add this session to your Summit calendar!

Casting light on (Bare) Metal: Meet the Ironic Prometheus Exporter!

  • Presented by Iury Gregory Melo Ferreira from Red Hat
  • What can you expect to learn?
    • Baseboard Management Controllers (BMCs) can provide useful monitoring data about physical nodes. This information is beneficial to operators, allowing them to detect issues and take actions when something is wrong (e.g., High temperatures can cause unexpected node shutdowns and you can define different alerts). With the Train release a new way to use this data was born, please welcome the Ironic Prometheus Exporter (IPE)!
    • Ironic builds upon the principles of open source and open standards to enable support for a diverse ecosystem of hardware from a variety of vendors. By using the IPE, you will have multiple metrics for each physical node (e.g., temperature, power) available on Prometheus, as long as the vendor supports IPMI or Redfish protocol.
    • During the session, we will show how to configure IPE, share advantages & limitations, show a demo of how to monitor different physical nodes, and what operators can achieve when using it and close with our plans for further improvements.
  • Add this session to your Summit calendar!

Ceilometer, CloudKitty and Gnocchi: a dynamic and agnostic cloud monitoring and billing stack.

  • Presented by Rafael Weingartner
  • What can you expect to learn?
    • Attendees should expect to get insights on how to address the monitoring and billing needs of their organization by leveraging open source components such as Gnocchi, Ceilometer, and CloudKitty. The talk will present the idea behind a dynamic monitoring system and how it fits nicely with the needs of cloud companies. Moreover, we will showcase real-life use cases of the proposed stack, and how it was leveraged to integrate different systems/services into the OpenStack billing solution that was already in place.
  • Add this session to your Summit calendar!

China Mobile software and hardware integrated portable cloud platform test system

Discover OpenStack’s nerve with oslo.metrics: Have a robust private cloud on a large scale

  • Presented by Motomu Utsumi & Reedip Banerjee from LINE Corporation
  • What can you expect to learn?
    • How we are collecting metrics from oslo libraries
    • What kind of metrics we are getting from oslo libraries
    • How oslo.metrics can support monitoring and parameter tuning
    • What characteristics of OpenStack’s RPC we observed
  • Add this session to your Summit calendar!

Effective migration from Newton to Queens with minimal downtime

  • Presented by Attila Szlovencsák from GE Digital, Viktor Schlaffer from Nokia Networks
  • What can you expect to learn?
    • Cloud to cloud migration is complex (even if done between two OpenStack platforms)
    • The biggest challenges during migration (and a proposed solution)
    • Scale integration: moving instances between Scalr environments/farms
    • Methods to copy data effectively between regions.
  • Add this session to your Summit calendar!

Elastic Secure Infrastructure (ESI): I Learned to Share my Hardware; You Can Too!

  • Presented by Tzu-Mainn Chen from Red Hat
  • What can you expect to learn?
    • Attendees will learn about the ESI project, and the planning decisions that led to our initial phase of work. Detailed explanations of that work will allow attendees to understand how features will be used for the MOC, and how it can be adapted by other projects as well. Finally, a demo will showcase how all our efforts are coming together to form a hardware leasing system.
  • Add this session to your Summit calendar!

Hassle-free migration from OVS to OVN

  • Presented by Frode Nordahl from Canonical
  • What can you expect to learn?
    • Quick introduction of OVN
    • Planning a migration and steps required prior to a migration
    • Steps required during a migration
    • Demonstration of how the process is automated for OpenStack deployments deployed with OpenStack Charms
  • Add this session to your Summit calendar!

Hybrid Cloud Ecology of Tencent, Reconstructing the future digital infrastructure

  • Presented by Ruan He from Tencent.
  • What can you expect to learn?
    • Tencent has been deploying a hybrid cloud architecture within the company since 2012 and is a pioneer in China’s hybrid cloud construction. Chief Architect Dr. Ruan He will share Tencent’s best practice experience in hybrid cloud, presenting one-stop hybrid cloud solutions in terms of cloud native architecture, platform capabilities, and service capabilities, helping traditional industries build more flexible enterprise IT architectures.
  • Add this session to your Summit calendar!

I Don’t Think This Means What You Think It Means: Red Herrings in OpenStack

  • Presented by Florian Haas from City Network International AB
  • What can you expect to learn?
    • OpenStack’s complexity comes with operational challenges. And in situations where OpenStack misbehaves, it is frequently non-trivial to find the actual cause of an issue. This talk includes several examples of red herrings in OpenStack, and suggestions for spotting and avoiding them.
    • This is useful for both OpenStack cloud operators and users of OpenStack clouds.
  • Add this session to your Summit calendar!

Introduction of FaaS (Function as a Service) in StarlingX

  • Presented by Sharath Kumar K & Poornima Nagaraju from Intel.
  • What can you expect to learn?
    • StarlingX architecture and implementation, and how FaaS as a service will help StarlingX end users run their code in the most effective and cost-efficient manner.

Lessons Learned from a Large Scale OpenStack Deployment with Tripleo

  • Presented by Pradipta Sahoo & Sai Sindhur Malleni from Red Hat.
  • What can you expect to learn?
    • In the session, we will talk about our journey to 700+ overcloud nodes using TripleO with OpenStack Ussuri, the lessons learned, and the issues identified and fixed. We will also talk about control-plane performance with real-world use cases.

Live migration from VMware to OpenStack

  • Presented by Jack Lee, Brin Zhang from Inspur.
  • What can you expect to learn?
    • How VMware virtualization uses agentless technology and VMware’s changed block tracking (CBT) technology.
    • Data compatibility issues between OpenStack and VMware.
  • Add this session to your Summit calendar!

OpenStack Cluster Installer: the Debian way to deploy OpenStack

  • Presented by Thomas Goirand from Infomaniak
  • What can you expect to learn?
    • Attendees will learn about this new deployment solution and how to deploy and use it.
  • Add this session to your Summit calendar!

Performance optimization of Neutron server and Agent for large-scale scenario

  • Presented by Haizhong Qin from Inspur, Yongfeng Du from Intel.
  • What can you expect to learn?
    • Analysis and optimization of the Neutron control plane process, including the DHCP agent.
    • Analysis and optimization of how the OVS agent handles changes to security group membership.
  • Add this session to your Summit calendar!

Practice and Thinking of Telecom-cloud Management Platform based on Cloud-native

Private cloud price-performance analysis

  • Presented by Tytus Kurek from Canonical.
  • What can you expect to learn?
    • Attendees will learn about the economics of private clouds vs public clouds and why organisations decide to run their workloads in private clouds. They will study what criteria to take into account when designing a private cloud infrastructure to ensure TCO reduction. They will see how the TCO per VM varies between leading public cloud and private cloud platforms.
  • Add this session to your Summit calendar!

Scaling Bare Metal Provisioning with Nova and Ironic at CERN

  • Presented by Arne Wiebalck & Belmiro Moreira from CERN
  • What can you expect to learn?
    • In this talk you will learn about the issues we encountered and the solutions we deployed to scale our bare metal deployment with Nova and Ironic to several thousand nodes.
  • Add this session to your Summit calendar!

SLO’s for Openstack – the API for Engineering Teams

  • Presented by Kit Merker from Nobl9, Joseph Sandoval from Adobe Systems
  • What can you expect to learn?
    • How to deliver reliable features faster without degrading the customer experience with private cloud. Service Level Objectives (SLOs) are customer-centric goals that define expectations between the stakeholders of your service. SLOs are an essential tool for any Site Reliability Engineering (SRE) team to achieve sustainable customer happiness when running OpenStack. After the session you will know how to implement SLOs and build a culture around them within your organization.
  • Add this session to your Summit calendar!

Tackling operational and capacity utilization challenges at a large scale enterprise deployment

  • Presented by Bogdan Katyński, Silvano Buback, Imtiaz Chowdhury from Workday
  • What can you expect to learn?
    • Attendees of this presentation can learn:
      • Overview of capacity management, planning and operational challenges an enterprise cloud operator faces with large OpenStack deployments.
      • Concepts like memory fragmentation and blast radius. How to measure them and how they impact server capacity utilization.
      • How Workday manages capacity with multi-cluster deployment across data centers
      • Difference between scheduler filters and weighers.
      • How to simulate/visualize different weighers configurations. We are going to show the tools developed by Workday.
      • Scheduler limitations with very high-concurrency deployments and how to overcome them.
      • How to write custom weighers and how to deploy them without changing OpenStack original packages.
      • How Workday developed custom weighers to maximize capacity usage while minimizing blast radius and service downtime.
  • Add this session to your Summit calendar!

The amphorae of application delivery & security with Octavia

  • Presented by Saurabh Sureka, Richu Channakeshava, Hunter Thompson from A10 Networks
  • What can you expect to learn?
    • Introduction to Octavia LBaaS
    • Challenges, best practices and path to migrate to Octavia LBaaS (from Neutron)
    • Enable consistent framework for application delivery and security for hybrid cloud
  • Add this session to your Summit calendar!

The Magic of the Libvirt Driver – What to Do When You Invariably Hit “No Valid Host”

  • Presented by Stephen Finucane from Red Hat
  • What can you expect to learn?
    • Upon conclusion, attendees should have a better understanding of how some of the more obscure race conditions and errors in nova can be triggered and how you can work around them if you hit them. This will be illustrated through a relatively deep dive into nova’s architecture.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: Private & Hybrid Cloud appeared first on Superuser.

by Sunny Cai at October 14, 2020 01:00 PM

October 12, 2020

OpenStack Superuser

#OpenInfraSummit Track: Public Cloud

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—Public Cloud. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

Day-3-Operations: New Region and New Services

  • Presented by Clemens Hardewig, Nils Magnus & Sebastian Wenner from T-Systems.
  • What can you expect to learn?
    • How to scale your public cloud (this is not limited to technology)
    • How to anticipate the next ten years of OpenStack infrastructure.
    • To think ahead and deal with the multitude of options on the platform layer.
  • Add this session to your Summit calendar!

Enhancement and Optimization of Computing, Storage and Network for Bare Metal with Hardware Offloading

  • Presented by Yajun Yang & Yunxiang Tao from China Mobile Suzhou Software Technology Co.
  • What can you expect to learn?
    • Public Cloud Bare Metal
    • Hardware offloading
    • VirtIO
    • OVS, DPDK, SPDK
  • Add this session to your Summit calendar!

Managed Kubernetes Service for Public Cloud with Openstack and Kubernetes

  • Presented by Sa Pham from VietNam Communications Corporation.
  • What can you expect to learn?
    • Attendees can learn how to build a Kubernetes service using our architecture and get a good sense of the issues with OpenStack Magnum. They will also learn how to integrate OpenStack Senlin with Kubernetes for cluster management and autoscaling.
  • Add this session to your Summit calendar!

SCS: Large Federated Infrastructure for Sovereignty

  • Presented by Christian Berendt from Betacloud Solutions GmbH, Kurt Garloff.
  • What can you expect to learn?
    • Attendees will learn why sovereignty matters and why we need Open Infrastructure platforms to achieve it.
    • The presentation will point out the various areas that need to be addressed (technical, political, economic, cultural) to allow the project to succeed.
    • It will give an overview of the chosen architecture and technology components and discuss how the project goes beyond previous attempts to create a sustainable, well-integrated platform.
    • The presentation will describe the development process and the new way that operators collaborate to share operational knowledge and practices, which today still represents one of the main hurdles to being a successful service provider.
    • It will show how an operator may benefit from this and help by contributing to a larger ecosystem. It will discuss how enforced transparency makes providers better and how this helps users trust the platforms.
  • Add this session to your Summit calendar!

Status and Challenges for Interop Certified Operators

  • Presented by Nils Magnus & Vineet Pruthi from T-Systems.
  • What can you expect to learn?
    • How the Interop compliance program works.
    • No longer mixing up RefStack, Interop, certification, testing, and powered-by programs.
    • How to plan ahead your certification.
  • Add this session to your Summit calendar!

Yet Another Monitoring Solution? Why APImon is Different

  • Presented by Artem Goncharov & Nils Magnus from T-Systems.
  • What can you expect to learn?
    • Distinguish different types of monitoring.
    • Tips from features of OpenStackSDK you were not aware of.
    • Gaining insight into real-world API workloads
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: Public Cloud appeared first on Superuser.

by Superuser at October 12, 2020 01:00 PM

October 08, 2020

OpenStack Blog

10 Years of OpenStack – Xiaoguang Zhang at China Mobile

Happy 10 years of OpenStack! Millions of cores, 100,000 community members, 10 years of you. Storytelling is one of the most powerful means to influence, teach, and inspire the people around us. To celebrate OpenStack’s 10th anniversary, we are spotlighting stories from the individuals in various roles from the community who have helped to make... Read more »

by Sunny at October 08, 2020 05:57 PM

October 07, 2020

VEXXHOST Inc.

Join Us at the Open Infrastructure Summit 2020 and Here’s Why


It’s that time of the year again when the OpenStack community across the world comes together. Yes, the Open Infrastructure Summit 2020 is here again, and we couldn’t be more excited!

The latest edition of the summit is being held from 19th to 23rd of October. Owing to the pandemic, it will happen virtually – on the cloud, if we may take the liberty to crack that joke. We will definitely miss the face-to-face interaction and the collective energy of being together at one location, but there are many things lined up at the Summit to be thrilled about.

Team VEXXHOST has even more reason to look forward because we’re a headline sponsor for the summit this year. Also, our CEO, Mohammed Naser, would be delivering a Keynote address. That’s double the joy for us, and we hope to see you there as well!

Before we get into the details of what else is coming up at this year’s event, let’s take a short trip down memory lane and look at past editions of the summit.

Boston OpenStack Summit 2017

It was the first time the summit ran under the Forum format. Attendees numbering in the thousands flocked to the city of the Red Sox in May 2017 to connect with the OpenStack community – learn more, pool their resources, and advance their growth.

The interop challenge was one of the highlights that time, cementing the fact that OpenStack and its open standards work and coexist well. Another notable highlight was the introduction of a remotely managed private cloud offering by the Foundation, something that could be seen almost as “Private-Cloud-as-a-Service.”

You could read our official recap of the summit here.

Vancouver OpenStack Summit 2018

The Vancouver Summit had over 500 sessions and was attended by over 2,600 members of the OpenStack community.

This edition of the summit had us in the role of a sponsor and steady contributor and donor to the open source infrastructure. Our CEO Mohammed Naser presented an OpenLab Demo on the first day and later held a workshop on how to install and configure your first jobs with Zuul.

Find more details about that edition of the summit here.

Berlin OpenStack Summit 2018

The open source fraternity really came together for the Berlin Summit held in November 2018. Attendees from across the world participated in sessions and workshops on Public Cloud, Private Cloud, Container Infrastructure, CI/CD, etc.

At the Berlin summit, Team VEXXHOST had the opportunity to announce new enterprise-grade GPU instances for our public, private, and hybrid clouds. Our CEO, Mohammed Naser, was also part of the keynotes on the first two days.

The recap of the Berlin edition can be found here.

Denver Open Infrastructure Summit 2019

The Denver edition of the summit was the first one with the new name ‘Open Infrastructure Summit.’ Team VEXXHOST was one of the proud sponsors of the event and announced our new Kubernetes enablement offering and many others.

The summit was even more special for VEXXHOST because we were nominated for and won the Super User Award, thanks to the OpenStack community.

Read all the juicy details here.

Shanghai Open Infrastructure Summit 2019

In November 2019, Shanghai saw OpenStack enthusiasts flock together to learn, experience, and share the magical new developments and updates in the open source community. We were proud to hand over the Super User award to the new winner and our CEO Mohammed Naser held several discussions and presentations over the course of three days.

If you would like to know more details on the summit, they’re here.

Now that we’ve seen what’s happened in the past, let’s see what’s lined up for the virtual Open Infrastructure Summit 2020.

Virtual Open Infrastructure Summit 2020

The pandemic year comes with many challenges, but the open source community is as ready as ever to make the first virtual summit a success. IT decision-makers, open source developers, and operators numbering in the thousands, representing a staggering 750 companies and 110 countries, will attend the event.

The summit promises to be a one-of-a-kind event where enthusiasts from various open source communities, such as Ansible, Ceph, Kubernetes, Airship, Docker, ONAP, Kata Containers, OpenStack, Open vSwitch, Zuul, StarlingX, OPNFV, and more, come together.

The various keynotes, presentations, forums, sessions, and workshops will be on diverse topics such as:

  • Container Infrastructure
  • 5G
  • CI/CD
  • NFV & Edge
  • Public, Private & Hybrid Clouds
  • Security
  • AI, Machine Learning, HPC

Team VEXXHOST is All Set for Summit 2020

We mentioned it before but would like to do it again because we’re so proud of it – VEXXHOST will be a headline sponsor at this year’s summit!

Our CEO, Mohammed Naser, will deliver a keynote presentation on Tuesday, October 20th, between 10 AM and noon CT.

You can find us at our virtual booth during the event, and we’ll be serving up a variety of exciting announcements about our public and private clouds and never-before-seen offers for our visitors! Make sure to stop by and say hello.

We would also like to let you know that many exciting announcements are coming your way from us at the summit. So, stay tuned!

To find more information about the summit, head on to the official Open Infrastructure Summit 2020 page. If you’re already excited and would like to register for the event outright (it’s FREE!), the place is here.

We look forward to seeing you at the summit. More updates to come soon!

Like what you’re reading?
Deep dive into a hands-on ebook about how you can build a successful infrastructure from the ground up!

The post Join Us at the Open Infrastructure Summit 2020 and Here’s Why appeared first on VEXXHOST.

by Athul Domichen at October 07, 2020 08:46 PM

OpenStack Superuser

#OpenInfraSummit Track: Container Infrastructure

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—Container Infrastructure. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

Building Containers is Fun. Let’s See how?

  • Presented by Arun Chaudhary from Oracle.
  • What can you expect to learn?
    • A brief intro to containers
    • Container runtime
    • Linux Capabilities
      • Adding Capabilities
      • Dropping capabilities
      • Figuring out what’s needed
    • Cgroups in brief
      • some tests around it
    • Namespaces in brief
      • Possible problems
      • Short term solution
      • long term solution
    • Explore user namespace
  • Add this session to your Summit calendar!

Building High Efficient Storage Infrastructure for Secure Container on Top of SPDK

  • Presented by Changpeng Liu & Xiaodong Liu from Intel.
  • What can you expect to learn?
    • Current typical container storage infrastructure
    • How storage devices are virtualized for secure container
    • How vhost-user storage devices are recognized by container runtime through OCI runtime spec
    • The SPDK libraries and modules that constitute a userspace container storage infrastructure
  • Add this session to your Summit calendar!

Building Kubernetes Operators with the Operator Framework and Ansible

  • Presented by Keith Tenzer from Red Hat.
  • What can you expect to learn?
    • Attendees will leave with an understanding of the value of Operators, how to build their own Operators with the Operator Framework and share them with the community via OperatorHub.
  • Add this session to your Summit calendar!

Connecting Ecosystems: How Cinder CSI, Ember CSI and Manila CSI Leverage OpenStack bits in Kubernetes

  • Presented by Christian Schwede, Gorka Eguileor & Tom Barron from Red Hat.
  • What can you expect to learn?
    • Basic concepts of the Container Storage Interface
    • Provided services and differences for the different CSI projects:
      • Cinder CSI
      • Ember CSI
      • Manila CSI
    • How operators are used to simplify deployments of these drivers
  • Add this session to your Summit calendar!

Cloud-hypervisor: A New Choice for Virtual Machine Monitor

  • Presented by Henry Wang & Michael Zhao from Arm.
  • What can you expect to learn?
    • How Cloud-hypervisor came about, with some background regarding Rust and Rust-VMM.
    • Use case analysis, e.g. using Cloud-hypervisor as the runtime of Kata Containers.
    • Architecture introduction.
    • Demonstration: a live demo of how to run VMs with different feature/device configurations.
  • Add this session to your Summit calendar!

Declarative Chain to Kubernetes Multi Clusters for Automation of HA Workloads.

  • Presented by Alex Barchiesi from CERN, Matteo Di Fazio & Marco Lorini from Consortium GARR.
  • What can you expect to learn?
    • From A to Z of the Kubernetes cluster federation on top of a declarative setup of a multi-region OpenStack infrastructure.
    • We’ll cover the MAAS and Juju layer to achieve OpenStack in a declarative way, and then re-use Juju with a different backend (OpenStack itself) to get multiple Kubernetes setups that will be federated through KubeFed, giving the possibility to move workloads from one region to another.
  • Add this session to your Summit calendar!

It’s a Multi-Mesh World

  • Presented by Lee Calcote from SolarWinds.
  • What can you expect to learn?
    • Many attendees struggle to understand the details of working with different service meshes and the challenges they bring.
    • Using Meshery, demonstrations will be given in the context of the Service Mesh Interface, Hamlet, and the Service Mesh Performance Specification, projects that many attendees are unfamiliar with and will benefit from learning about during this session.

Kata * TEE = A Lego-like Two-way Sandbox for Seamless Security and Privacy

  • Presented by Kailun Qin from Ant Group.
  • What can you expect to learn?
    • What are the container attack vectors at the Cloud and Edge? What are their implications for security and privacy? Where and why is a two-way sandbox needed?
    • What is a Trusted Execution Environment (TEE)? How does it help protect sensitive code and data in use?
    • What Kata Containers is and what it currently offers to boost security in containers.
    • How to leverage Kata with TEE technologies to build your own two-way sandbox in a Lego-like way, and what adaptations are needed in relevant open source projects like Kata Containers, rust-vmm, etc.
    • How the end-to-end ease and seamlessness of usage is achieved on orchestration platforms such as OpenStack and Kubernetes?
    • A PoC built on top of Kata, which showcases the seamless two-way isolation user experience and the lego-like developer experience.
    • An outline of the current status and next steps for upstream.
    • Practice experience within Ant Group.

Observability in Kata containers 2.0

  • Presented by Bin Liu from OneAPM.
  • What can you expect to learn?
    • What observability is and why it is important
    • Observability techniques in Kata Containers 2.0
    • How to use observability in Kata Containers 2.0
  • Add this session to your Summit calendar!

Run your Kubernetes Cluster on OpenStack in Production

  • Presented by Anita Tragler, Franck Baudin & Ramon Acedo Rodriguez from Red Hat.
  • What can you expect to learn?
    • Review a production-ready deployment reference architecture for Kubernetes on OpenStack.
    • Understand the deployment and networking challenges of providing stable OpenStack APIs and services while still supporting bare-metal performance for Kubernetes-orchestrated container workloads.
    • Learn how OpenStack provides a stable and robust convergence platform for bare-metal, VMs and container applications.
  • Add this session to your Summit calendar!

Own Your YAML: extending Kustomize via Plugins

  • Presented by Matt McEuen from AT&T.
  • What can you expect to learn?
    • What Kustomize is, and when you might use it
    • Different plugin types, and how you can develop them
    • A transformer plugin example:  Airship’s ReplacementTransformer
    • A generator plugin example: Airship’s HostGenerator
    • How Airship integrates with Kustomize to manage configuration at scale
    • A demo will be incorporated
  • Add this session to your Summit calendar!

Significance of Hardware Classification Combined with Host Configuration Operator

  • Presented by Digambar Patil from Calsoft Inc, John Williams from Dell EMC Inc., and Sirisha Gopigiri.
  • What can you expect to learn?
    • Attendees will learn how to use the hardware classification controller and the host configuration operator to classify and configure nodes and clusters. As an intermediate, technical presentation, it is a good opportunity for learning about DevOps and day-1/day-2 operations. It also shows how the bare metal operator combined with Ironic provides inputs for CRDs: the hardware classification part adds labels to BaremetalHost CRs, and the host configuration operator uses these labels to configure the selected hosts and extend the operation.
  • Add this session to your Summit calendar!

The Best of Both Worlds: Running Highly Efficient Containers Inside High Performance VMs

  • Presented by Erez Cohen & Itay Ozery from Mellanox Technologies.
  • What can you expect to learn?
    • In a virtualized platform, the entity that dispatches packets between the network and the VMs (and between local VMs) is the virtual edge bridge (VEB), such as Open vSwitch (OVS), Linux Bridge, etc. Usually these switches are implemented in software running in the hypervisor. Because of the software datapath, VEB performance is either limited or consumes a significant number of CPU cores and cycles in order to perform reasonably well.
    • With virtualized container deployments, this performance degradation problem gets compounded even further by an additional switching layer. Thus, although virtualized containers offer great benefits, they suffer from severe network performance degradation.
    • The presenters propose mechanisms to accelerate packet switching and processing and thus turbo boost networking performance of containers within VMs.

The Practice and Landing of Kata Containers in Ant Group and Alibaba Group

  • Presented by Fupan Li from Ant Group, Wei Yang from Alibaba Cloud.
  • What can you expect to learn?
    • In this presentation, the presenters will show the details and the scale of how Kata Containers is used at Ant Group and Alibaba Group, the features they have enhanced in Kata Containers, and the optimizations they have made for Kata.
  • Add this session to your Summit calendar!

Toward Next Generation Container Image

  • Presented by Yan Song from Alipay.
  • What can you expect to learn?
    • Container image basics, drawbacks and improvements. Kata Containers isolation and multi-tenancy.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some of the most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: Container Infrastructure appeared first on Superuser.

by Superuser at October 07, 2020 01:00 PM

October 06, 2020

OpenStack Blog

10 Years of OpenStack – Hieu Le-Quang at Viettel Group/Vietnam OpenInfra User Group

Happy 10 years of OpenStack! Millions of cores, 100,000 community members, 10 years of you. Storytelling is one of the most powerful means to influence, teach, and inspire the people around us. To celebrate OpenStack’s 10th anniversary, we are spotlighting stories from the individuals in various roles from the community who have helped to make... Read more »

by Sunny at October 06, 2020 03:00 PM

October 05, 2020

OpenStack Superuser

#OpenInfraSummit Track: CI/CD

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—CI/CD. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

Auto-scaling Jenkins Cluster on OpenStack Cloud Platform

  • Presented by Cong Ha Minh & Chien Pham Tuong from Viettel Group
  • What can you expect to learn?
    • Fundamentals of Jenkins cluster architecture and how masters and slaves work
    • Jenkins monitoring, the OpenStack Cloud plugin's supported features and how they help
    • Defining Jenkins resource templates and a strategy for auto-scaling
  • Add this session to your Summit calendar!

CI-CT-CD, Devops Loop for Future Telco Cloud

  • Presented by Fu Qiao, China Mobile
  • What can you expect to learn?
    • Learn about the use case for CI/CD in telco, and how CI/CD technology should evolve to meet the requirements of such scenarios
  • Add this session to your Summit calendar!

CI and CD as a Service: A Multi-vendor Driven System

  • Presented by Abhijeet Singh, Avinash Raghavendra & Sharath Rao from AT&T
  • What can you expect to learn?
    • Challenges faced by solution designers in a multi-vendor environment
    • Use cases for CI and CD
    • Example setup with community tools to implement the presented approach
  • Add this session to your Summit calendar!

Combining Ansible and Terraform for CI – Better Together Love Story Based on OVN-CI Project

  • Presented by Arie Bregman & Szymon Datko from Red Hat.
  • What can you expect to learn?
    • Attendees will learn what Ansible and Terraform are, what these tools are capable of, and how to use them through a series of practical examples. The presenters will then show how these tools can be used together in a single workflow, leading to a complete CI scenario, and will demonstrate helpful tricks and useful strategies along the way.
  • Add this session to your Summit calendar!

Getting Started with Zuul

  • Presented by James Blair from Red Hat
  • What can you expect to learn?
    • Attendees will learn how to set up a Zuul system and how to start creating pipelines and jobs for testing and deployment.

Implementing a CI/CD Platform on OpenStack that Doesn’t Hurt

  • Presented by Mark Maglana from Canonical
  • What can you expect to learn?
    • A comprehensive overview of what it takes to build a CI/CD platform that containerized app developers will actually like to use; tips on how to make the CI/CD platform developer's own workflow easier; and some wisdom from Jez Humble's books put into context.
  • Add this session to your Summit calendar!

User Report of a Zuul Installation in Production

  • Presented by Artem Goncharov & Nils Magnus from T-Systems International GmbH
  • What can you expect to learn?
    • A brief overview of Zuul features.
    • How to install, configure, and avoid pitfalls.
    • Judge how Zuul fares in an enterprise environment
  • Add this session to your Summit calendar!

Zuul at Volvo Cars Corporation

  • Presented by Albin Vass & Johannes Foufas from Volvo Cars
  • What can you expect to learn?
    • Volvo Cars Corporation will present how they successfully use Zuul CI for a range of different software components.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some of the most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: CI/CD appeared first on Superuser.

by Superuser at October 05, 2020 01:00 PM

October 01, 2020

OpenStack Superuser

OpenStack 10 Years: A Revolution for Telco

In celebrating OpenStack's 10th birthday, we would like to share the revolution and transformation OpenStack has brought to telco operators.

As the key technology for NFV (Network Function Virtualization), OpenStack is now deployed on more than 60,000 physical servers in China Mobile's network clouds. This technology helps telco operators enjoy the benefits of low cost, high efficiency and agility. Since 2019, China Mobile has been carrying out its telco-cloud construction, which is also the world's largest NFV project so far, with tens of thousands of servers in total, located in eight core regions across the country. More than 20 hardware and software suppliers are included in this telco cloud. Scalability and the multi-vendor nature of the cloud are the major challenges for our network cloud.

Telco operators used to build networks by following standards. However, when it comes to IT cloud, there is no specific standard to follow, but rather many open source technologies and best-practice experience. By working with the OpenStack community, China Mobile has adopted these technologies and this experience, making full use of them to build its network cloud. Over the years, OpenStack has become the de-facto standard for China Mobile's network cloud infrastructure. Based on this de-facto standard, we also built our automation testing platform, named "AUTO", and a cross-vendor CI/CD pipeline. AUTO makes full use of the OpenStack SDK and existing testing tools to provide overall checks and verification for our network cloud, especially for scalability and performance.

We also built a cross-vendor CI/CD pipeline to provide iterative testing on multi-vendor clouds. Our CI/CD pipelines connect with vendor labs, including Ericsson, Huawei, H3C and ZTE. Any version update from a vendor triggers automatic deployment and system testing in China Mobile's lab, and the results are fed back into the vendor products. With the help of the CI/CD pipeline, we are able to continuously deploy and test vendor OpenStack more than 10 times per week, with each round taking less than five hours. Such cross-vendor CI/CD helps our cloud iterate quickly and precisely.

When deploying such a large-scale cloud, we also realized it is important to improve the efficiency and quality of hardware integration, since any hardware defect will eventually affect the software and make errors difficult to trace. Therefore we also extended AUTO to provide hardware configuration and testing. So far, AUTO has been used in all eight regions across the country and has helped reduce our construction time by a third. It takes only 20 minutes to configure all the devices using AUTO, and 80 minutes for AUTO to finish testing a single resource pool of more than 1,000 physical nodes. Based on AUTO, we accomplish full-scale quality checks. So far, more than 15,000 issues have been found and solved. AUTO has also helped us decrease the configuration fault rate from 30% to zero.

Our team has benefited a lot from open source over the past 10 years. None of this automation magic could have happened without the open source nature of OpenStack and the de-facto standard interfaces it provides to NFV. We have also been active contributors to OpenStack since 2013. So far, we have given one keynote and nine sessions at OpenStack Summits and Open Infrastructure Summits, and have served on the programming committee six times. We share our successful CI/CD and hardware automation experience with the community through these sessions and have actively contributed to the telco and Edge Computing working groups. China Mobile is also one of the founding members and active contributors of OPNFV and CNTT. We have engaged in building up the joint effort of these open source communities to conquer the integration challenges of the telco cloud.

On OpenStack's 10th birthday, we would like to thank this community for remaining active and productive and for evolving so quickly to fulfill the requirements of telcos in such a short time. As a telco operator, we see OpenStack as the key to the multi-vendor cloud. We will continue to contribute to and support this community, to make sure a sustainable open source community is always there as the de-facto standard for our cloud.

The post OpenStack 10 Years: A Revolution for Telco appeared first on Superuser.

by Fu Qiao at October 01, 2020 08:00 AM

September 30, 2020

OpenStack Superuser

#OpenInfraSummit Track: 5G, NFV & Edge

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—5G, NFV & Edge. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

China Tower’s CDN Edge Computing Solution Based on Intel Servers and StarlingX

  • Presented by Jianqing Jiang from 99Cloud Inc, Junyu He from Intel and Liang Zhang from Shanghai Tower.
  • What can you expect to learn?
    • Explore the StarlingX Architecture and CDN technology with community members from China Tower, Intel and 99cloud.
  • Add this session to your Summit calendar!

Deploying Container Network Functions (CNF) on Kubernetes

  • Presented by Ajay Kalambur & Yichen Wang from Cisco System.
  • What can you expect to learn?
    • The audience would be able to walk away with ideas on what it takes to deploy a CNF which has high throughput and low latency demands on top of Kubernetes.
  • Add this session to your Summit calendar!

DIMINET: On Demand Multi-site Connectivity for Edge Infrastructures

  • Presented by David Espinel from Orange.
  • What can you expect to learn?
    • The talk will be mainly divided into two major parts:
      • First,  the presenters will discuss challenges related to the distributed control plane scenario, in particular, the scalability and the importance of sites’ autonomy.
      • Second, the presenters will present DIMINET and explain how it can be used in order to provide connectivity in a distributed manner for several OpenStack instances.
  • Add this session to your Summit calendar!

Hardware Automation Deployment and Test with Big Scale in ChinaMobile NFV Cloud

  • Presented by Xiaoguang Zhang & Zhiqiang Yu from China Mobile.
  • What can you expect to learn?
    • Learn about automation technology, server and switch technology, and network infrastructure
  • Add this session to your Summit calendar!

How does the Telco Industry Benefit from the CNTT/OPNFV Outcomes?

  • Presented by Sukhdev Kapur from Juniper Networks, Walter Kozlowski from Telstra Corporation Australia, Karine Sevilla from Orange, Tomas Fredberg from Ericsson.
  • What can you expect to learn?
    • This presentation delves into the real-world technology challenges, opportunities and business needs we expect cloud infrastructure implementers to encounter in conforming to the CNTT Reference Architectures. This will cause CNTT Cloud Infrastructure solutions to dynamically evolve and the CNTT Reference Model needs to adapt to this fast-paced evolution. The direction of the evolution is dictated by the adoption of the Cloud Native Network Functions that require support from a Container as a Service virtualization layer (CaaS) and the need for the co-existence of several virtualisation technologies requiring separate and yet co-ordinated management.
    • The growing number of real-life deployments, especially in areas like 5G, IoT and Edge, multiplies the number of variations in mostly proprietary solutions. Therefore, there is strong community interest in normalizing such deployments using directions from a stable yet flexible CNTT Reference Model.
  • Add this session to your Summit calendar!

I Deployed my Snowflake and it Didn’t Melt! Now What?

  • Presented by Gianpietro Lavado from Whitestack, Mark Beierl from Canonical.
  • What can you expect to learn?
    • How to break down a complex network service into manageable components and create descriptors to automate the management and orchestration.
    • How to monitor complex network services
  • Add this session to your Summit calendar!

Kubernetes as a Container Platform for the Edge

  • Presented by Karim Manaouil, Adrien Lebre from IMT Atlantique.
  • What can you expect to learn?
    • The behavior of vanilla Kubernetes on the edge, namely the effect of latency and packets loss on Pod startup times, API requests durations and service discovery.
    • Draw conclusions and learn the possible limitations and pitfalls of deploying Kubernetes on the Edge from the previous experiments.
    • An overview of the main initiatives that propose to extend Kubernetes for the Edge and their architectural limitations, namely KubeEdge, Kubefed and Submariner.
  • Add this session to your Summit calendar!

Living on the Edge with DCN Networking

  • Presented by Anita Tragler & Bernard Cafarelli from Red Hat Inc.
  • What can you expect to learn?
    • Review of key Edge DCN networking reference architectures for telco, IoT and retail Edge
    • Understand the differentiated needs and challenges of DCN Edge workloads requiring
      • localized services using DVR and availability zones (DHCP, routing, load balancing)
      • Sophisticated networking with L3 networks (routed provider networks), low latency constraints, QoS and NIC partitioning with bandwidth guarantees
      • Specialized scheduling and live migration with Nova availability zones
  • Add this session to your Summit calendar

Making a Better World by Edge Computing Infrastructure: the Experience and Thinking from FiberHome

  • Presented by Wang Hao & Linxiang Chen from Fiberhome Telecommunication Technologies Co.
  • What can you expect to learn?
    • The presenters will share some experiences about the issues they solved at Fiberhome and also some thoughts about the future of edge computing infrastructure.
  • Add this session to your Summit calendar!

Minimum Viable Core for 4G and 5G Dual Connectivity

  • Presented by Prakash Ramchandran from Dell, Jonathan Bryce from OpenStack Foundation, Boris Renski from FreedomFi.
  • What can you expect to learn?
    • Service providers, chip manufacturers, and system, service and network vendors have all been facing declining ARPU and funding shortages due to Covid-19, and are eager to collaborate to cut costs. With the 3GPP (Rel 15/16) 5G NR specs approved by ITU IMT-2020, what is needed is a reference implementation of the 5G core (5GC) to migrate to New Radio (NR) through NSA and SA architectures. The panel will share the insights they have gained through their efforts, working toward a future plugfest to test both hardware and software functionality and to deploy and maintain an upstream version of packages and sources under the BSD 3.0 license, so that others can contribute to and use the source code.
  • Add this session to your Summit calendar!

NTT and KDDI Challenges for Sustainable Infrastructure Transformation

  • Presented by Hiroshi Tsuji from KDDI Corporation, Toshiaki Takahashi from NEC Corporation, Yasufumi Ogawa from NTT.
  • What can you expect to learn?
    • For telecom operators such as NTT and KDDI, which have been adopting virtualized platforms, it is a major challenge to achieve sustainable platform transformation and reduce integration and verification costs. Attendees can learn how Tacker is continually responding to new technologies and strengthening its development process to tackle this challenge. In collaboration with ETSI NFV, Tacker is creating a development process to automate interface and spec compliance testing and will promote a solution for CNF control.

OSF Edge Computing WG: Defining the Undefinable

  • Presented by Beth Cohen from Verizon, Gergely Csatari from Nokia Networks, Ildiko Vancsa from OpenStack Foundation, David Paterson from Dell EMC.
  • What can you expect to learn?
    • Attendees will learn just how much the Edge Computing Working Group’s projects have advanced over the past year as the work has expanded to other projects within both the OpenStack and the wider Open Source communities.  This is an opportunity to learn how you can participate in driving innovation in this exciting new superset of the cloud computing technology, as the working group continues its mission to provide requirements to help vendors and Open Source projects create tools and features in support of the use cases.
  • Add this session to your Summit calendar!

Time-Sensitive Networking (TSN) Enabling on StarlingX

  • Presented by Yi Wang from Intel.
  • What can you expect to learn?
    • The basic concepts of time-sensitive networking (TSN): what TSN is and what benefits it offers
    • Some experience on how to enable TSN capability for workloads
    • How to measure TSN network performance
  • Add this session to your Summit calendar!

Updating Firmware Using Ironic and Redfish in the Datacenter and at the Edge

  • Presented by Chris Dearborn & David Paterson from Dell EMC.
  • What can you expect to learn?
    • Attendees will learn:
      • How to use Ironic to remotely update firmware on servers using the Redfish driver.
      • How to determine whether the update was successful, and how to troubleshoot issues should they occur.
      • How to update firmware for edge nodes deployed in remote networks
  • Add this session to your Summit calendar!

What CNTT Thinks About Containers?

  • Presented by Georg Kunz from Ericsson, Gergely Csatari from Nokia Networks.
  • What can you expect to learn?
    • Attendees will get an overview of the CNTT Kubernetes Based Reference Architecture (RA2), Reference Implementation (RI2) and Reference Conformance (RC2) and the related activities in OPNFV.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some of the most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: 5G, NFV & Edge appeared first on Superuser.

by Superuser at September 30, 2020 01:00 PM

Galera Cluster by Codership

Galera Manager Deploys Galera Cluster for MySQL on Amazon Web Services

Today marks a new era for Galera Cluster monitoring and management, as we release Galera Manager 1.0-beta into the wild for everyone to evaluate, test, and deploy their Galera Clusters within an Amazon Web Services (AWS) Elastic Compute Cloud (EC2) environment, achieving MySQL high availability, multi-master MySQL on the cloud, and disaster recovery, all from the comfort of a web-based graphical user interface (GUI).

What does Galera Manager do? Galera Manager is a deployment, management and monitoring solution for Galera Clusters. A user can easily create clusters, add and remove nodes, and create geo-distributed clusters across multiple AWS regions, all with the click of a few buttons in one’s web browser. Even more useful are the over 620 monitoring metrics available to monitor the health of your clusters. Being fully web-based, you can say goodbye to having to access a console, configure each my.cnf individually, and bootstrap a cluster from scratch. Galera Manager brings Galera Cluster to the rest of us, beyond just DBAs!

Galera Manager is the tool to deploy Galera Clusters within an AWS EC2 environment.

You can start with t2.micro instances for all your needs during evaluation, but naturally you are expected to launch more powerful instances for production use cases. Currently, Galera Manager allows you to create a new Galera Cluster with either MySQL 5.7 or MySQL 8.0. The supported base operating systems are CentOS 7, Ubuntu 18.04 LTS and Ubuntu 20.04 LTS.

It is well documented: the Galera Manager documentation should be considered the prime place to get information about Galera Manager.

Let’s get started with an install! For budgeting purposes, you will need four (4) t2.micro instances, which are also AWS free-tier eligible. Create an EC2 node with SSH access in any region that you prefer. For simplicity, Amazon Linux 2 was chosen as the base operating system (as it remains largely compatible with CentOS). For ease of use, run the install as root (sudo su or the like will suffice).

Once SSH’ed into your instance, you will need to download the installer:

wget https://galeracluster.com/galera-manager/gm-installer

After that, make it executable by running chmod +x gm-installer, then start the installer with ./gm-installer install.
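Putting the steps together, a minimal install session on the instance looks roughly like this (a sketch only, assuming a fresh Amazon Linux 2 instance and a root shell):

# Fetch the Galera Manager installer
wget https://galeracluster.com/galera-manager/gm-installer

# Make it executable and run the interactive installation
chmod +x gm-installer
./gm-installer install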

You will have to accept the End User License Agreement (EULA). Most things can be left at their defaults, and you can follow along with the install documentation, but it is worth noting that when it comes to domains & certificates, if you choose to enter an IP address (quite the default when you’re testing and using AWS), you will get a warning stating: Since you entered an IP address, SSL won't be available. There are some implications to this, i.e. everything goes over HTTP as opposed to HTTPS, in plaintext, including your AWS credentials. However, you will get set up for evaluation a lot faster.

Once all that is done, wait for the installer to complete. This takes a few minutes. You will see a message displayed as such (your IP address or hostname will vary of course):

INFO[0143] Galera Manager installation finished. Enter http://34.228.140.255 in a web browser to access. Please note, you chose to use an unencrypted http protocol, such connections are prone to several types of security issues. Always use only trusted networks when connecting to the service.

You will also note that there are Logs and Metrics database URLs, which you can safely ignore. You do however need to open up some ports in the firewall within Amazon (you do this via Security Groups). The ports in question are 80, 443, 8091 and 8092, as documented at AWS Ports with Galera Manager. The change takes effect immediately, and it is in addition to port 22 (SSH), which is already open.
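If you prefer the command line over the AWS console for this step, the same ports can be opened with the AWS CLI along the following lines (a sketch; the security group ID below is a placeholder for the group attached to your instance):

# Allow the Galera Manager ports (80, 443, 8091, 8092) in the instance's security group
for port in 80 443 8091 8092; do
  aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port "$port" --cidr 0.0.0.0/0
done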

Now you can access your Galera Manager via any web browser. Log in with the credentials you provided earlier, and you can get started on creating your first 3-node cluster. Start by creating a cluster. While you can provide a “Custom DB engine configuration”, I would advise against this if you are trying to get started as quickly as possible; if you choose to do so, please refer to the guide in Deploying a Cluster in Galera Manager. Pick your AWS region for the initial cluster setup, configure which DB engine you want and which host OS it should be running, and get your AWS Access Key ID and AWS Secret Access Key from your AWS console. For more information, do read: Understanding and getting your AWS credentials (we also cover it in the documentation piece linked above). It is crucial to provide an authorised SSH key to ensure that you are able to log in to the servers on which your Galera Clusters are being deployed.

Once your cluster is created, you need to create some nodes within it. Galera Cluster runs best with a minimum of three nodes, and it is recommended that you deploy as such. You can opt to automatically start nodes after deployment. You can also choose to segment the cluster here, but generally the defaults are all you will need to get going. Here you can also choose your instance type (note that this is different from where you installed Galera Manager; the software itself runs fine with a t2.micro, but your needs may vary when it comes to a production cluster, as AWS EC2 has multiple instance types that can be suited to your workload). Your region can also be different from where you installed Galera Manager (you might notice that it defaults to Frankfurt). As for the host type, you’ll notice that there is ec2 as the default and locallxd as well; for the purposes of this guide, you should just leave it at ec2 (though this hints at where we are headed from a roadmap standpoint).

Once you click Deploy, you’ll notice that the install starts. You can switch tabs at this point while the deployment happens. This can take anywhere between 5 and 20 minutes. If you are wondering why the timing is so non-deterministic, it has a lot to do with all the packages being installed and the latency between your instances and the repositories. The installer also wants to ensure that there are no failures, so on a slower instance (sometimes you have a noisy neighbour) there will be more “sleeps” built into the software to ensure proper execution. The good news is that you will see individual ETAs for your instance deploys. You can also hop over to the individual logs to see how the instance installs are progressing.

Once the deploy is completed, you will be presented with your new managed cluster. You will see tabs for monitoring, logs, configuration (which allows you to find out hostname information so you can use the mysql client to connect to or SSH into the server), as well as jobs that are running.

Congratulations, you now have a 3-node Galera Cluster deployed entirely using the GUI tool, Galera Manager. In future blog posts we will cover adding nodes, removing nodes, dropping into your cluster via SSH, and more.

We are looking for feedback on this beta, so please do email: info@codership.com. We also plan to release updates to this beta regularly.

Being a beta, there are naturally several known bugs, one example being debug logging turned on by default. Toggles to improve this will come with the next release. Being web-based software, we plan to release and deploy updates much more frequently.

by Sakari Keskitalo at September 30, 2020 07:00 AM

September 28, 2020

StackHPC Team Blog

StackHPC Shortlisted for OpenStack SuperUser Award

With the virtual Open Infrastructure Summit just a few weeks away, the SuperUser Award shortlist has been announced, and StackHPC is thrilled to have been selected as a nominee.

SuperUser StackHPC nomination

Since our formation about five years ago, we have followed a vision of the opportunities offered by open infrastructure for scientific and research computing. This nomination is a tremendous validation of our contribution to the open infrastructure community in that time.

Fingers crossed for the winner announcement during the opening virtual keynote!

Get in touch

If you would like to get in touch we would love to hear from you. Reach out to us via Twitter or directly via our contact page.

by Stig Telfer at September 28, 2020 05:00 PM

OpenStack Blog

10 Years of OpenStack – Elizabeth K. Joseph at IBM

Happy 10 years of OpenStack! Millions of cores, 100,000 community members, 10 years of you. Storytelling is one of the most powerful means to influence, teach, and inspire the people around us. To celebrate OpenStack’s 10th anniversary, we are spotlighting stories from the individuals in various roles from the community who have helped to make... Read more »

by Sunny at September 28, 2020 03:00 PM

OpenStack Superuser

#OpenInfraSummit Track: AI / Machine Learning, HPC

The Open Infrastructure Summit, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security. Thousands of attendees are expected to participate, representing 30+ open source communities and more than 110 countries.

Today, we are featuring one of the seven Summit tracks—AI / Machine Learning, HPC. Get your Summit tickets for free and don’t forget to add these sessions to your Summit calendar!

Docker for Machine Learning!

Enhancement and Optimization of New Heterogeneous Accelerators Based on Cyborg

  • Presented by Brin Zhang & Wenping Song from Inspur, Shaohe Feng from Intel.
  • What can you expect to learn?
    • Cyborg architecture
    • Nova, Cyborg interaction mechanism
    • Accelerator performance optimization
    • Cyborg’s benefits to the cloud platform
  • Add this session to your Summit calendar!

Federated Digital Research Infrastructures for Health Data Research

  • Presented by John Taylor from StackHPC Ltd.
  • What can you expect to learn?
    • Requirements on data governance in handling sensitive data.
    • What platforms are used for analysis.
    • What computation techniques are being used in exploring data such as HPC, AI, etc.
    • Requirements for federation in order to bring the “Researcher to the Data”.
    • How current OpenStack infrastructures are being used to address these aspects.
  • Add this session to your Summit calendar!

How to split a cake? GPUs with OpenStack

  • Presented by Sylvain Bauza from Red Hat.
  • What can you expect to learn?
    • Operators will know how to use Nova for providing instances with virtual GPUs. They will also know about the current supported features and the next ones.
  • Add this session to your Summit calendar!

Lessons Learnt Building Cambridge University’s CSD3 Supercomputer with OpenStack

  • Presented by John Garbutt from StackHPC, Paul Browne from the University of Cambridge.
  • What can you expect to learn?
    • Find out about how Cambridge University manages around 1000 nodes using OpenStack to deliver a mix of Baremetal, VM and Container based workloads. The presenters will include details of how the full lifecycle of the hardware is automated using OpenStack, instead of xCAT. There is a mix of workloads across AI, HPDA, HPC and HTC all sharing a single pool of diverse hardware. Using OpenStack helps break the hardcoding between infrastructure resources and the science platform used to access those resources, and also eases the transition towards more portal science platforms through the adoption of infrastructure as code tooling.
    • Routine re-imaging of Slurm is integrated with OpenStack. A new networking-generic-switch driver is used to dynamically provision the Cumulus switches.
  • Add this session to your Summit calendar!

Machine Learning at Edge Cloud

  • Presented by Prakash Ramchandran from Dell, Vivek Hariharan from Indeed.
  • What can you expect to learn?
    • Attendees will learn how to set up an optimal environment, both in AWS and with Open-Infra projects for running machine learning models and transmitting inferences with low latency. Application developers will be able to better understand application requirements when integrating machine learning models.
    • Ultimately, understanding the technical requirements to allow for machine learning to be effectively integrated with Edge Cloud infrastructure will help attendees innovate, whether it be with AWS or with Open-Infra projects.
  • Add this session to your Summit calendar!

Standalone yet at scale: Ironic take on High Performance Computing

  • Presented by Timothy Randles from Los Alamos National Laboratory, Julia Kreger & Jacob Anders from Red Hat.
  • What can you expect to learn?
    • In this presentation, the presenters will focus on the unique combination of ease of deployment and powerful capabilities that Ironic standalone brings to High Performance Computing (HPC).
  • Add this session to your Summit calendar!

The Coral Reef Cloud: Autoscaling, Reservation and Preemption

  • Presented by John Garbutt & Steve Brasier & Pierre Riteau from StackHPC.
  • What can you expect to learn?
    • In a fixed capacity cloud, it can be hard to manage a fair share of resources between lots of groups of users. The presenters will explore how Blazar can help with these problems.
    • When you first use quotas, you dedicate resources to specific groups. Soon it becomes clear more people want some quota, and the existing users have not yet used all they asked for. The obvious next step is to hand out quota for more resources than you have in your cloud. Eventually the cloud becomes full, instance builds go into an ERROR state with “no valid host”. Then you find people start leaving VMs running, and using rebuild instead of delete, in case they lose access to their “hard won” slice of resources.
    • The presenters will look at how to manage this resource allocation better through the use of Blazar reservations, and how to avoid under-utilization by using preemptible instances.
  • Add this session to your Summit calendar!

Don’t miss these Summit sessions and get your Summit ticket for free!

Still on the fence?

View the full Summit schedule and check out some of the most anticipated Summit sessions that you might love!

Participate:

Follow the #OpenInfraSummit hashtag on Twitter, Facebook, LinkedIn and make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive behind-the-scenes content on how the Summit is being organized!

Participate in the conversation on

Twitter: twitter.com/OpenStack

Facebook: facebook.com/OpenStack

YouTube: youtube.com/user/OpenStackFoundation

WeChat ID: OpenStack

The post #OpenInfraSummit Track: AI / Machine Learning, HPC appeared first on Superuser.

by Superuser at September 28, 2020 01:00 PM

CERN Tech Blog

Preemptible Instances in production at CERN

Cloud providers need to ensure that they have enough capacity available when users request it. As a result, they need to have spare capacity: unused servers. In other words, they have exactly the same problem that they are trying to solve for their IaaS customers. Amazon Web Services (AWS) was the first public cloud provider to address this challenge. In 2009 AWS released the “Spot Instances” marketplace. The idea is that unused server capacity can be sold at a massive discount (up to 90% compared with On-Demand instances), with the caveat that the instances can be terminated at any time when AWS needs the capacity for On-Demand or Reserved instances.

by CERN (techblog-contact@cern.ch) at September 28, 2020 11:00 AM

September 24, 2020

OpenStack Superuser

Inside Open Infrastructure: The latest from the OpenStack Foundation

Spotlight on:

Open Infrastructure Summit schedule is live!

The schedule features keynotes and sessions from users like Volvo Cars Corporation, Workday, GE Digital, Société Générale, LINE, Ant Group, and more! The event, held virtually for the first time, takes place October 19-23 and includes more than 100 sessions around infrastructure use cases like cloud computing, edge computing, hardware enablement, and security as well as hands on training and an opportunity to interact with vendors in the Open Infrastructure Marketplace.

10,000+ attendees are expected to participate, representing 30+ open source communities and more than 110 countries. Keynotes begin at 10am Central Time on Monday, October 19 and again on Tuesday, October 20.

Sessions for the Summit focus on the most important open infrastructure issues and challenges facing global organizations today:

  • AI / Machine Learning, HPC
  • Container Infrastructure
  • CI/CD
  • 5G
  • NFV and Edge Computing
  • Cloud Computing: Public, Private and Hybrid
  • Security

View the full Summit schedule!

Still on the fence?

Read this Superuser article and check out some of the most anticipated Summit sessions that you might love!

Make sure to subscribe to the OpenStack Foundation (OSF) YouTube channel to get exclusive content on how the Summit is being organized!

The Superuser Awards nominees are now available for community review! Check out the 8 nominees and rate your favorites.

Get my free Summit ticket!

OpenStack Foundation news

Open Infrastructure Summit, October 19-23, 2020

Project Teams Gathering (PTG), October 26-30, 2020

Airship: Elevate your infrastructure

  • Airship in the news! Read how AT&T is using Airship to deploy all of the operator’s network cloud deployments running their 5G workloads.
  • The Technical Committee is pleased to announce that the Airship 2.0 beta milestone completion is imminent. This milestone included 135 issues, with the last open issues being targeted for completion by the end of September.
  • Updated meeting cadence!
    • Airship SIG UI: design topics will be included during Airship Open Design meetings, and grooming sessions will now occur during the Airship Flight Plan Call. Read the announcement here.
    • Airship Slack/IRC meeting: Change from weekly to bi-weekly. Please see the original announcement here.

Kata Containers: The speed of containers, the security of VMs

  • The community has just tagged the 2.0.0-rc0 Kata-Containers release which is the first release candidate for Kata Containers 2.0. Next, we will focus on fixing bugs and making it stable for the incoming 2.0.0 release. Check out the 2.0.0-rc0 release highlights.
  • We have four candidate submissions for the Architecture Committee election, where we have three seats available. Thus, we shall enter the Q&A and subsequently the voting phases. The current period (September 18th – 27th) is for asking questions to candidates via this mailing list.

OpenStack: Open source software for creating private and public clouds

  • We are in the last stages of preparation for the Victoria release, this week being the deadline for the first release candidates, in preparation for the final release on October 14.
  • Elections for OpenStack leadership for the upcoming Wallaby development cycle (PTLs and TC seats) began September 22nd with the nomination period extending for one week, until September 29th. Polling will take place October 6th through the 13th. For more details on the technical elections, check out the election site here.
  • A new SIG has been proposed to gather people interested in discussions around packaging. The rpm-packaging team will be transitioned into this SIG, however, the Packaging SIG can also include other packagers like Debian or Ubuntu. This patch discusses the creation of the SIG.
  • Discussion around Wallaby release goals has begun! Graham Hayes sent out a call for updates to goals previously suggested and for goal champions to draft them into proposals for acceptance for Wallaby. 
  • Are you looking for OpenStack-related jobs? Set yourself apart from other candidates by taking the Certified OpenStack Administrator (COA) exam. See more details here!

StarlingX: A fully featured cloud for the distributed edge

  • The fall elections are approaching quickly! The community will elect the Project and Technical Leads as well as some of the TSC seats. If you are interested in nominating yourself or follow the process you can find more information on the elections webpage.
  • The StarlingX community is currently in the 5.0 release cycle. The community is deciding on the release maintenance periods, which are currently planned for 12 months. The community is also planning a maintenance release, targeting October 2020 for the availability of 3.0.1. For more information see the release wiki.

Zuul: Stop merging broken code

  • Get prepared for the next version of Zuul. The next release will drop support for Python 3.5 and Ansible 2.7. Redeploy your Zuul services on Python 3.6 or newer and migrate any jobs to Ansible 2.8 or 2.9 before your next Zuul Upgrade. See the release notes for more details.

Check out these Open Infrastructure Community Events!

For more information about these events, please contact denise@openstack.org.

Questions / feedback / contribute

This newsletter is written and edited by the OSF staff to highlight open infrastructure communities. We want to hear from you! If you have feedback, news or stories that you want to share, reach us through community@openstack.org . To receive the newsletter, sign up here.

The post Inside Open Infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by Sunny Cai at September 24, 2020 08:32 PM

Galera Cluster by Codership

Galera Clustering in MariaDB 10.5 and beyond

Continuing on our series of coverage of what happened at the recent MariaDB Server Fest 2020, now we will focus on the next talk by Seppo Jaakola, titled: Galera Clustering in MariaDB 10.5 and beyond.

A quick overview includes not just information about Galera 4 in MariaDB 10.4, but also the new features available in MariaDB Server 10.5, namely: GTID consistency, cluster error voting, XA transactions within the cluster, non-blocking DDL operations and Black Box. There is also a focus on MariaDB Server 10.6 planning.

There have been many more wsrep-API changes (which require rolling upgrades) than major versions of Galera; currently Galera 4 is the latest major version, and the latest wsrep-API version is 26. Rolling upgrades are key in Galera to ensure that when upgrades happen there is no downtime within the cluster at all. The biggest feature in Galera 4 is the ability to use streaming replication, which helps you execute large (greater than 2GB) and long-running transactions. The feature was ready in MariaDB 10.3 but did not quite catch the release train, so it ended up in MariaDB 10.4.

Global Transaction ID (GTID) compatibility and consistency is a new feature implemented by Mario Karuza, as there were GTID incompatibilities between Galera and MariaDB. Galera stores GTIDs as <uuid>:<sequence number>, whereas MariaDB stores them as <domain-id>:<node-id>:<sequence number>. Galera Cluster now uses the same domain and node ID, and the software now stores and shows only the MariaDB-format GTID. Galera Cluster can also operate as a secondary (slave) for a MariaDB primary (master) server, and the GTID coming from the MariaDB master is also preserved in the Galera Cluster (you can find the same GTIDs in the binary log files). You can read more at: Using MariaDB GTIDs with MariaDB Galera Cluster.
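For reference, the settings involved look roughly like the following my.cnf fragment (a sketch assuming MariaDB 10.4/10.5; the domain ID values are arbitrary examples, and wsrep_gtid_domain_id must be identical on every node of the cluster):

[mysqld]
# Keep MariaDB-format GTIDs consistent across the whole cluster
wsrep_gtid_mode = ON
wsrep_gtid_domain_id = 9999   # same value on all Galera nodes
gtid_domain_id = 11           # local domain, typically unique per node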

Cluster Error Voting (which started life as MDEV-17048) is a new feature implemented by Alexey Yurchenko; it is a protocol for nodes to decide how the cluster will react to problems in replication. This new feature helps when one or several nodes have an issue applying an incoming transaction (e.g. a suspected inconsistency). In a 5-node cluster, if two nodes fail to apply the transaction, they get removed, and a DBA can go in to fix what went wrong so that the nodes can rejoin the cluster.

XA Transaction Support is a feature implemented by Daniele Sciaccia and Leandro Pacheco de Sousa; the goal is for Galera Cluster to operate as a resource manager in an XA infrastructure. Under a transaction coordinator, Galera Cluster can act as an XA resource manager, so it should be able to prepare and then rollback or commit a transaction. XA transactions are supported thanks to the implementation of streaming replication, which is a foundation for them. The work was ready for MariaDB 10.4 but was not accepted into the main tree; it was then targeted for MariaDB 10.5 but still missed the train (since there was other XA support work done in MariaDB Server and a conflict between the two teams' work). So now it will be in MariaDB Server 10.6.

Non-blocking DDL is a feature implemented by Teemu Ollakka and it is only included in MariaDB Enterprise Server edition. Seppo decided not to go through the details of this feature since it is not in the regular MariaDB Server. This will be available in MariaDB Server eventually, but for now, it is an Enterprise only feature. You might be interested in the documentation for this: Performing Online Schema Changes with Galera Cluster.

Black Box is a feature implemented by Pekka Lampio, and it is also a MariaDB Enterprise Server edition feature. It allows you to store debug messages in a main memory (shm) ring buffer, and it helps with troubleshooting a crashed server or even in cluster testing.

MariaDB 10.6 planning includes XA transaction support testing and documentation work as well as making it work with SPIDER cluster testing (since the SPIDER storage engine is part of MariaDB and also depends on XA support). This helps MariaDB 10.6 to become a “sharding cluster”. This will also enable extreme write scalability that could come from such a setup.

There is also ongoing work, around the test system improvements (this does not show up for the end user, but extending the test coverage is extremely important for development), dynamic SSL/TLS encryption (so you could change an SSL implementation at runtime) as well as further optimisations to streaming replication.

If you would like to see other new features in MariaDB 10.6 or even in Galera Cluster, do not hesitate to drop us a line to info@codership.com or even our GitHub.

by Sakari Keskitalo at September 24, 2020 09:06 AM

Introduction to MariaDB Galera Cluster at MariaDB Server Fest 2020

Seppo Jaakola, CEO of Codership, makers of Galera Cluster, recently gave a talk titled Introduction to MariaDB Galera Cluster at the MariaDB Server Fest 2020, and it is truly one of the best introductions to the software that is currently available and up-to-date as of September 2020. There is an overview, configuration, feature differences between asynchronous replication as well as the releases and release cycles that are available.

As Seppo says in the talk, we work closely with MariaDB to ensure consistent support and services around Galera Cluster, which is also why we have a good strong partnership, from an engineering to services standpoint.

Galera is a replication plugin, and in my.cnf you need to point the configuration to where the Galera plugin resides. This is done with wsrep_provider, and you can also set wsrep_provider_options. You start with one node, then start another node, telling it the other cluster addresses via wsrep_cluster_address. In principle, all nodes should have the full list in wsrep_cluster_address (it is a good idea to keep my.cnf in sync across nodes). You also need a State Snapshot Transfer (SST) method, set via wsrep_sst_method (for example wsrep_sst_method=rsync) – this is how newly joining nodes get a copy of the entire database. In this case, once the handshake occurs, the joiner node gets an rsync copy of all the data; once the copy is completed, the server joins and operates as a node in the synchronous database cluster.
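As a concrete illustration, a minimal my.cnf fragment for such a node might look like the following (a sketch only, with a couple of commonly required settings included as assumptions; the provider path, cluster name and IP addresses are placeholders for your own environment):

[mysqld]
binlog_format = ROW
default_storage_engine = InnoDB

# Galera provider and cluster membership
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = my_galera_cluster
wsrep_cluster_address = gcomm://10.0.0.1,10.0.0.2,10.0.0.3

# How newly joining nodes receive a full copy of the data
wsrep_sst_method = rsync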

A two-node Galera Cluster is naturally not very functional; we recommend three nodes as the minimum cluster size. Three nodes are the minimum needed for effective voting. Thanks to synchronous replication, every node in the cluster is a full backup of the others. All writes in the cluster are replicated, and all nodes can be used for reading.

Galera Cluster is based on a generic replication plugin for database servers, and uses the replication API to interact with the database management system (DBMS) via the wsrep API (the project is open source on GitHub). MariaDB 10.1 and later have Galera Cluster built in, which makes it easy to get started. Seppo goes through the complete list of wsrep configuration options, but in practice you need very few to get going.

MariaDB 10.4 has 66 wsrep-specific status variables, and when it comes to monitoring, the most common variables are wsrep_ready, wsrep_cluster_status and wsrep_cluster_size. Besides configuration and status variables, there are also three Galera tables in the mysql schema: wsrep_cluster, wsrep_cluster_members (the active members of the cluster) and wsrep_streaming_log (streaming replication, a new feature in Galera 4 and MariaDB 10.4, is a method for replicating very big/long transactions).
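For a quick health check, the common variables mentioned above can be queried from any node, for example (assuming the mysql command-line client and sufficient privileges; the table name is as in MariaDB 10.4):

# Cluster health at a glance: ready state, cluster status and size
mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_ready','wsrep_cluster_status','wsrep_cluster_size');"

# Active members of the cluster
mysql -e "SELECT * FROM mysql.wsrep_cluster_members;"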

With streaming replication, transactions are processed in small chunks on each node in the cluster. Status is stored in the mysql.wsrep_streaming_log table. In case of issues, state information for long-running transactions is kept here. It can also be used to monitor transactions and how long they have been processing, which makes it a good method for troubleshooting ill-behaving transactions.

Some interesting features that Galera Cluster brings: synchronous replication (close to fully synchronous, although the commit is currently only done on one node in the cluster, giving the look and feel of synchronous replication – wsrep_sync_wait helps with this). Flow control keeps node progress even, and all nodes are equal, allowing both read and write access. Conflicting writes are handled (hence multi-master use is possible – Galera picks which write to commit, rolls back the conflicting write with a deadlock error, and the transaction can be retried based on configuration). There is obviously also automatic node joining.
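As an example of wsrep_sync_wait in practice, a session that needs read-your-writes semantics can turn on causality checks before a read (a sketch; the database and table names are hypothetical):

# Enforce a causality check so the SELECT waits for the cluster to catch up first
mysql -e "SET SESSION wsrep_sync_wait = 1; SELECT COUNT(*) FROM shop.orders;"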

What is different between asynchronous replication (standard in MariaDB) and synchronous replication (Galera replication)? Basically, there are a few key differences:

  1. Asynchronous replication allows writes only to the master (primary) server and you end up with a master-slave (primary-secondary) topology.
  2. Secondary nodes may fall behind and suffer replication lag, as there is no flow control.
  3. Changing the primary server requires failover management, a burden that typically has to be handled by external software.

 

Galera development started in 2008, and MariaDB support arrived in the 5.5 and 10.0 versions. MariaDB 10.1 includes Galera Cluster in the main branch, so one download gives you everything, including the relevant libraries. Feature-wise, Galera replication has major version numbers 1, 2, 3, and 4; the current major version 4 ships with MariaDB Server 10.4 and greater. Changes to the wsrep API also mean that you will need to perform a rolling upgrade (e.g. if you are migrating from MariaDB Server 10.3 to MariaDB Server 10.4). It is important to note that Galera can be upgraded, but rolling downgrades are not supported (i.e. you cannot go from MariaDB Server 10.4 with Galera 4 back to MariaDB Server 10.3 with Galera 3).

For MariaDB Server 10.6, there may be some major changes going into Galera; this will be decided within the next one to two months.

Galera Cluster also works in WAN and cloud installations, and MariaDB 10.4 has significant new Galera 4 features, as stated in this blog post, so it is highly recommended that you use MariaDB 10.4 or later for evaluation.

Watch out for our next post about the other session that Seppo gave that focuses on what is around in MariaDB Server 10.5 and what is coming in MariaDB Server 10.6.

by Sakari Keskitalo at September 24, 2020 08:56 AM

September 22, 2020

OpenStack Superuser

Meet the 2020 Superuser Awards nominees

Who do you think should win the Superuser Award for the 2020 Open Infrastructure Summit?

When evaluating the nominees for the Superuser Award, take into account the unique nature of use case(s), as well as integrations and applications of open infrastructure by each particular team. Rate the nominees before September 28 at 11:59 p.m. Pacific Daylight Time.

Check out highlights from the eight nominees and click on the links for the full applications:

  • Adobe Platform Infrastructure Team ranked 14th among the largest corporate open source contributors (per GitHub data) in May 2019. They make concerted efforts to free their employees to participate in open source development and to streamline the approval process for employees to contribute code. Adobe is committed to open infrastructure and has been actively involved in related communities, including OpenStack since 2013 and Kubernetes since 2019. Adobe IT OpenStack has five clusters spread across three locations in North America and Asia; three of these clusters are production. Over the last five years, it grew 1,000% and presently hosts 13,000+ VMs on 500+ physical hypervisors. The underlying 3.5 PB Ceph infrastructure regularly serves 200,000+ IOPS. Besides OpenStack, their Hadoop and Kubernetes implementations grew exponentially in the last few years and now account for thousands of nodes.
  • China Mobile Network Cloud Integration Team‘s network cloud includes more than 60,000 physical servers and 1,440,000 cores so far, all based on OpenStack and KVM. These servers are distributed in eight regions across the country and support core network services for more than 800 million users across China. Their self-developed AUTO platform has been used in every region across the country. So far, they have tested 68 resource pools, covering more than 60,000 servers, 11,000 switches, and more than 500,000 network connections in a CI/CD manner. The CI/CD pipeline in the team‘s lab is based on Jenkins and other open source CI/CD tools. It now supports continuous deployment and test iteration for four vendors, covering more than 500 test cases each time.
  • Leboncoin started using Zuul for open source CI two years ago with Zuulv2 and Jenkins. In the beginning, they only used Gerrit and Jenkins, but as new developers joined Leboncoin every day, this solution was not enough. After some research and a proof-of-concept, they gave Zuul a try, running between Gerrit and Jenkins. In less than a month (and without much official documentation) they set up a complete new stack. They ran it for a year before moving to Zuulv3. In terms of compute resources, they currently have 480 cores, 1.3 TB of RAM and 80 TB available in their Ceph clusters. In terms of jobs, they run around 60,000 jobs per month, which means around 2,500 jobs per day. Average job time is less than five minutes.
  • LINE uses OpenStack for 80% of their new instance creation. Their 50,000+ physical servers, including baremetal servers and hypervisors across four regions, and 67,000+ VM instances give them the capability to reach over 180 million users while reducing operational costs and cutting delivery time from weeks to minutes.
  • SK Telecom 5GX Labs, on top of contributing upstream to OpenStack and Airship, an open source project supported by the OSF, developed a containerized OpenStack on Kubernetes solution called SKT All Container Orchestrator (TACO), based on OpenStack-helm and Airship. TACO is a containerized, declarative, cloud infrastructure lifecycle manager that gives operators the capability to remotely deploy and manage the entire lifecycle of cloud infrastructure and add-on tools and services by treating all infrastructure like cloud native apps. They deployed it to SKT’s core systems, including the telco mobile network and IPTV services (5.5 million subscriptions), as well as for external customers (next generation broadcasting system, VDI, etc). Additionally, the team is strongly engaged in community activity in Korea, sharing their technologies and experiences with regional communities (OpenStack, Ceph, Kubernetes, etc).
  • StackHPC was formed about five years ago, with a vision of the opportunities offered by open infrastructure for scientific and research computing. In that time, the team has grown with the growth of open infrastructure, but has remained true to its roots, and everything it does is contributed upstream where possible. As such, the company is not just transformed but entirely inspired by open infrastructure. Their next vision is the software-defined supercomputer: they will be building giant machines, among the most powerful computers in the world, designed to solve some of the most challenging problems faced by science today, and will provide scientists and users with new ways of interacting with high-performance computing to help them get straight to the science.
  • Trendyol Tech, the largest e-commerce company in Turkey, is growing exponentially, and scale growth is driven directly by the Trendyol Tech Team. Using a wide variety of open source technologies including OpenStack Keystone, they plan to deploy their third region and increase their total core count to around 50,000 by the end of this year.
  • Workday Private Cloud Team has been actively involved in open infrastructure projects, participating in all the open infrastructure summits since the inception of its private cloud team. The team has presented Workday’s stories on scalability, deployment, performance, and operational challenges at the past six OpenStack and Open Infrastructure Summits. Workday engineering recently added support for encryption at rest on Ceph. It contributed to Chef cookbooks used for deploying open infrastructure, submitted bug fixes, and participated in code reviews. Workday has also actively participated in several operators events and meetups, and in 2018 organized several open infrastructure meetup events in the East Bay Area. WPC is currently running 43 open infrastructure clusters across 5 different data centers in the U.S. and Europe. The current number of cores is 422,000, the number of virtual machines running in production is 30,000, and the number of Kubernetes clusters is 70.

Each community member can rate the nominees once by September 28 at 11:59 p.m. Pacific Daylight Time.

Previous winners include Baidu, AT&T, City Network, CERN, China Mobile Network Cloud Integration Team, Comcast, NTT Group, the Tencent TStack Team, and VEXXHOST.

The post Meet the 2020 Superuser Awards nominees appeared first on Superuser.

by Superuser at September 22, 2020 10:33 PM

2020 Superuser Awards Nominee: StackHPC

It’s time for the community to help determine the winner of the 2020 Open Infrastructure Summit Superuser Awards. The Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner after the community has had a chance to review and rate nominees.

Now, it’s your turn.

StackHPC is one of eight nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate them before the deadline September 28 at 11:59 p.m. Pacific Daylight Time.

Rate them here!

Who is the nominee?

StackHPC

How has open infrastructure transformed the organization’s business? 

StackHPC was formed about five years ago, with a vision of the opportunities offered by open infrastructure for scientific and research computing.

In that time, our team has grown with the growth of open infrastructure, but we remain true to our roots and everything we do is contributed upstream where possible.

As such, the company is not just transformed but entirely inspired by open infrastructure.

How has the organization participated in or contributed to an open source project?

  • All of the above, since the moment of our creation!
  • We contribute code, bug reports, reviews, and documentation to open source projects.
  • We participate on the community mailing lists, IRC and Slack channels.
  • We contribute blueprints and implementations for major pieces of new functionality.
  • We present our work at open source conferences and meetups.

What open source technologies does the organization use in its open infrastructure environment?

The main open source technologies in our ecosystem are:

  • OpenStack
  • Ceph
  • Linux
  • Open vSwitch
  • Ansible
  • Kubernetes

What is the scale of your open infrastructure environment?

We work with clients with a broad range of use cases and scales:

  • From tens to thousands of compute nodes.
  • Virtualized, containerized, and bare metal workloads.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

We have worked on issues with performance and scale in a number of areas, including:

  • Provisioning bare metal compute nodes using Ironic at large scale.
  • Telemetry and monitoring of open infrastructure at large scale.
  • High performance virtualization for compute intensive workloads.
  • Using Ansible to manage open infrastructure at large-scale.

How is this team innovating with open infrastructure?

Our next vision is the software-defined supercomputer. We will be building giant machines, among the most powerful computers in the world, designed to solve some of the most challenging problems faced by science today. We will provide scientists and users with new ways of interacting with high-performance computing to help them get straight to the science.

And we will do all this using open infrastructure.

Each community member can rate the nominees once by September 28 at 11:59 p.m. Pacific Daylight Time.

The post 2020 Superuser Awards Nominee: StackHPC appeared first on Superuser.

by Superuser at September 22, 2020 10:32 PM

2020 Superuser Awards Nominee: Trendyol Tech

It’s time for the community to help determine the winner of the 2020 Open Infrastructure Summit Superuser Awards. The Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner after the community has had a chance to review and rate nominees.

Now, it’s your turn.

Trendyol Tech is one of eight nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate them before the deadline September 28 at 11:59 p.m. Pacific Daylight Time.

Rate them here!

Who is the nominee?

Trendyol Tech

How has open infrastructure transformed the organization’s business? 

Trendyol Tech is growing exponentially, and scale growth is driven directly by the Trendyol Tech Team. We value culture before anything else and welcome all people who say “We” before “I,” improve continuously, take ownership of matters, and have a “Let’s Do It” mindset. To cope with the scale growth, we continuously improve ourselves and excel in our technical skill set.

Our aim is to make engineering happen inside the company, developing systems and building new projects so that we grow together. We have a cloud infrastructure that follows the company’s growth day-to-day. Our plan for the future is to focus on faster time-to-market.

How has the organization participated in or contributed to an open source project?

As a team, we attend OpenStack Foundation events to stay updated on current topics and sponsor events such as OpenInfra Days Turkey, where one of our colleagues was a speaker.

Also, we are organizing meetups to share our knowledge and implementations among Tech enthusiasts who would like to learn more about our technologies.

We also contribute to upstream projects and one of our colleagues is a core reviewer of the Kolla project.

Check out some of Trendyol Tech’s contributions here:

What open source technologies does the organization use in its open infrastructure environment?

MAAS, OpenStack, Ceph, Kubernetes, PostgreSQL, Cassandra, Ansible, Terraform, SaltStack, Consul, Kafka, RabbitMQ, HAProxy, Tengine, Istio, Grafana, Elasticsearch, Prometheus, Golang, Java, Python.

What is the scale of your open infrastructure environment?

We have two regions up and running, and the third region will be deployed soon. We use a shared Keystone at large scale. The total core count will be ~50,000 by the end of this year.

Here is a brief detail about our services:

  • Kubernetes: 1,040 VMs & 100 clusters in the first region; 2,000 VMs & 100 clusters in both the second and third regions
  • Couchbase: 750 VMs & 100 clusters in each region
  • Elasticsearch: 536 VMs & 64 clusters in the first region; 1,000 VMs & 120 clusters in both the second and third regions
  • HAProxy: 334 VMs & 150 clusters in the first region only
  • PostgreSQL: 300 VMs & 60 clusters in each region
  • Cassandra: 10 VMs & 2 clusters in the first region; 100 VMs & 20 clusters in both the second and third regions
  • Kafka: 103 VMs & 12 clusters in the first region; 20 VMs & one cluster in both the second and third regions

What kind of operational challenges have you overcome during your experience with open infrastructure? 

The main challenge is often the Linux distribution itself; we use Ubuntu and try to work with the upstream. Another challenge is the architecture for a large-scale cloud. Also, some vendors do not meet our automation criteria. We are going to contribute to the Large-scale SIG to share our experiences.

Rolling upgrades are not a big issue; all our processes go through heavy testing before production.

How is this team innovating with open infrastructure?

  • The biggest change is transforming the virtualization technology to KVM.
  • We also succeeded in the transformation from a legacy CDN architecture to an object storage powered CDN. With the power of Ceph, the teams can develop cloud native applications.
  • Our DNS environment now runs on Designate.
  • Another ongoing process is testing the OpenStack Barbican project for production use.

Each community member can rate the nominees once by September 28 at 11:59 p.m. Pacific Daylight Time.

The post 2020 Superuser Awards Nominee: Trendyol Tech appeared first on Superuser.

by Superuser at September 22, 2020 10:32 PM

2020 Superuser Awards Nominee: Leboncoin

It’s time for the community to help determine the winner of the 2020 Open Infrastructure Summit Superuser Awards. The Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner after the community has had a chance to review and rate nominees.

Now, it’s your turn.

Leboncoin is one of eight nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate them before the deadline September 28 at 11:59 p.m. Pacific Daylight Time.

Rate them here!

Who is the nominee?

Leboncoin

How did your organization get started with Zuul?

We started using Zuul for open source CI two years ago with Zuulv2 and Jenkins. At the beginning, we only used Gerrit and Jenkins, but as new developers joined Leboncoin every day, this solution was not enough. After some research and a proof-of-concept, we gave Zuul a try, running between Gerrit and Jenkins. In less than a month (and without much official documentation) we set up a complete new stack. We ran it for a year before moving to Zuulv3. Zuulv3 is more complex in terms of setup but brings us more features using up-to-date tools like Ansible or OpenStack.

Describe how you’re using it:

We’re using Zuulv3 with Gerrit. Our workflow is close to the OpenStack one. For each review, Zuul is triggered on three “check” pipelines: quality, integration and build. Once the results are correct, we use the gate system to merge the code into the repositories and build artifacts.

We are using two small OpenStack clusters (3 CTRL / 3 STRG / 5 COMPUTE) in each datacenter. Zuul is currently set up on all Gerrit projects and some GitHub projects too. Below is our Zuulv3 infrastructure in production and in the case of datacenter loss.

 

Zuulv3 infrastructure in production.

 

Zuulv3 infrastructure in the case of DC loss.

What is your current scale?

In terms of compute resources, we currently have 480 cores, 1.3 TB of RAM and 80 TB available in our Ceph clusters. In terms of jobs, we run around 60,000 jobs per month, which means around 2,500 jobs per day. Average job time is less than five minutes.

 

What benefits has your organization seen from using Zuul?

As Leboncoin is growing very fast (and microservices too 🙂 ), Zuul allows us to ensure everything can be tested, and at scale. Zuul is also able to work with both Gerrit and GitHub, which lets us open our CI to more teams and workflows.

What have the challenges been (and how have you solved them)?

Our big challenge was migrating from Zuulv2 to Zuulv3. Even though everything uses Ansible, it was very tiresome to migrate all our CI jobs (around 500 Jenkins jobs). With the help of the Zuul folks on IRC, we reused some Ansible roles and playbooks used by OpenStack, but the migration took about a year.

What are your future plans with Zuul?

Our next steps are to use Kubernetes backend for small jobs like linters and improve Zuul with GitHub.

How can organizations who are interested in Zuul learn more and get involved?

Coming from OpenStack, I think meeting the community at Summits or on IRC is a good start. But Zuul needs better visibility. It is a powerful tool but the information online is limited.

Are there specific features that drew you to Zuul?

Scalability! And also ensuring that every commit merged into the repository is clean and cannot break the codebase.

What would you request from the Zuul upstream community?

Better integration with Gerrit 3, new Nodepool features and providers, full HA, and more visibility on the Internet.

Are you a Zuul user? Please take a few moments to fill out the Zuul User Survey to provide feedback and information around your deployment. All information is confidential to the OpenStack Foundation unless you designate that it can be public.

Cover image courtesy of Guillaume Chenuet.

Each community member can rate the nominees once by September 28 at 11:59 p.m. Pacific Daylight Time.

The post 2020 Superuser Awards Nominee: Leboncoin appeared first on Superuser.

by Superuser at September 22, 2020 10:32 PM

2020 Superuser Awards Nominee: LINE

It’s time for the community to help determine the winner of the 2020 Open Infrastructure Summit Superuser Awards. The Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner after the community has had a chance to review and rate nominees.

Now, it’s your turn.

LINE is one of eight nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate them before the deadline September 28 at 11:59 p.m. Pacific Daylight Time.

Rate them here!

Who is the nominee?

LINE

How has open infrastructure transformed the organization’s business? 

First of all, our open infrastructure dramatically reduces the time to deliver a new LINE service to end users. Delivery time of infrastructure for a new service decreased from one week to 10 minutes.

The operational cost of infrastructure has also decreased. After building our infrastructure, the infrastructure team can manage all services’ infrastructure through a centralized management system. App developers don’t need to worry about infrastructure operation, availability, etc., so they can focus on delivering value to end users.

The standardized and open API, e.g. OpenStack API, helps our global offices and engineers. The API helps communication between the app team and infra team, and also enables both the app team and infra team to build more advanced applications and infrastructure on top of it.

How has the organization participated in or contributed to an open source project?

Our open infrastructure project started four years ago, when we were trying to fix bugs we hit in OSS, especially in terms of scaling. For OpenStack, we shared the issues we hit in RabbitMQ scaling and operation, along with their fixes. We also joined the Large-scale SIG to discuss scaling issues of OpenStack clusters. As part of the Large-scale SIG, one of our activities was to develop and launch the oslo.metrics project, which visualizes OpenStack Oslo messaging layer metrics for OpenStack admins, and another was to participate in the OpenDev event as a Program Committee member and session moderator.

In the Kubernetes and related OSS communities, we reported bugs and pushed patches upstream for issues we hit in the data plane software while scaling our cloud. We also contributed to fixing a network scaling issue in the FRR community.

What open source technologies does the organization use in its open infrastructure environment?

To build our open infrastructure: OpenStack, Kubernetes, Rancher, Ceph, Kafka, Knative, Elasticsearch, MySQL, RabbitMQ, GlusterFS, Jenkins, DroneCI, HAProxy, Ansible, Redis, FRR, Nginx, PowerDNS, dnsmasq, libvirt, Linux.

Our open infrastructure services to app developers: OpenStack basic functionalities, Kubernetes, Kafka, Redis, MySQL, Elasticsearch, LB, Ceph.

What is the scale of your open infrastructure environment?

We have 50,000+ physical servers, including baremetal servers and hypervisors, across four regions. The number of VM instances is 67,000+ in total, and the largest region has 31,000+ VM instances. 80% of new instance creation is now done through OpenStack and virtualized.

We also manage 350+ Kubernetes clusters with 5,400+ nodes. The amount of data managed in our Ceph clusters is 17 PB across three regions.

The infrastructure can potentially serve 180+ million global users.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

We use OpenStack Keystone as a unified authorization system in our infrastructure and introduced the OpenStack “Request-id” concept to the management of non-OpenStack OSS.

Our infrastructure serves lots of managed services, like Kubernetes clusters, MySQL and Elasticsearch, in addition to VMs. All the services run as a microservice architecture (MSA), with Keystone as the identity service. By applying the Keystone concept to non-OpenStack services, it is really easy to integrate all services into our infrastructure, and app developers can operate all of our open infrastructure with a Keystone token.

The purpose of the “Request-id” concept is to track a user’s request across microservices. Adopting it makes it easier for the infra operator to investigate problems when a user request fails and investigation is required.

How is this team innovating with open infrastructure?

  • Some core components, e.g. OpenStack Keystone, are deployed in geographically separate areas for DR purposes.
  • Some business and security policies are integrated with our open infrastructure at the API layer.
  • A unified infra back-office GUI has been developed to manage all our infrastructure services.
  • Standardized infra control node operations are realized by deploying the nodes onto a shared Kubernetes cluster.
  • A CLOS network architecture has been introduced in order to scale and to reduce operational cost.
  • An SRv6 network mechanism has been introduced to handle multi-tenant networking for some special projects.

Each community member can rate the nominees once by September 28 at 11:59 p.m. Pacific Daylight Time.

The post 2020 Superuser Awards Nominee: LINE appeared first on Superuser.

by Superuser at September 22, 2020 10:32 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.
