June 19, 2019

Aptira

How to Install ONAP on Kubernetes using Cloudify

Aptira: Install ONAP on Kubernetes using Cloudify Puzzle

This “How to Install ONAP on Kubernetes using Cloudify” tutorial covers the process of installing a Kubernetes cluster using the Cloudify orchestrator and then installing ONAP on top of that cluster using ONAP’s Operations Manager (OOM).

The Kubernetes cluster is configured atop OpenStack, and it is presumed that OpenStack is already installed.

 

ONAP Software and Hardware Requirements:

For ONAP Casablanca the hardware and software requirements are as follows:

Software     Version
Kubernetes   1.11.5
Helm         2.9.1
kubectl      1.11.5
Docker       17.03.x

Resource     Size
RAM          224GB
Hard Disk    160GB
vCores       112

The hardware resource requirements above are for a full installation of ONAP, so the actual requirements will differ depending on which components need to be installed.

Cloudify Manager Installation:

  1. Download the CentOS 7 cloud image and upload it to OpenStack images.
    The CentOS image can be downloaded from: https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
  2. Spawn an OpenStack instance with the CentOS 7 image.
    System prerequisite details can be found on the following page: https://docs.cloudify.co/4.6/install_maintain/installation/prerequisites/
  3. Install Cloudify Manager, following the steps on https://docs.cloudify.co/4.6/install_maintain/installation/installing-manager/ (a minimal sketch is shown below).
  4. Once the Cloudify Manager installation is complete, the Cloudify UI can be accessed at http://<cloudify_manager_public_IP> using the credentials specified in the config file.
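
For reference, the manager installation boils down to a handful of commands on the CentOS instance. This is only an illustrative sketch (the RPM name is a placeholder and the config.yaml settings depend on your environment); follow the linked Cloudify documentation for the authoritative steps:


$ sudo yum install -y <cloudify-manager-install-4.6.rpm>  # placeholder: RPM downloaded as per the Cloudify docs
$ sudo vi /etc/cloudify/config.yaml                       # set admin credentials and the manager's public/private IPs
$ sudo cfy_manager install                                # runs the manager installation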

Setting Up Kubernetes Cluster with Cloudify:

  1. Log in to the Cloudify Manager machine and add the Cloudify secrets needed for setting up the Kubernetes cluster:


$ cfy secrets create centos_core_image -s centos7_image
$ cfy secrets create large_image_flavor -s xlarge
$ cfy secrets create keystone_username -s openstackadmin
$ cfy secrets create keystone_password -s its_a_secret
$ cfy secrets create keystone_tenant_name -s onap
$ cfy secrets create keystone_url -s https://cloud.openstack.com:5000/v3
$ cfy secrets create region -s RegionOne
$ cfy secrets create agent_key_private -f secret.txt
$ cfy secrets create private_subnet_name -s onap_subnet
$ cfy secrets create private_network_name -s onap_network
$ cfy secrets create public_network_name -s external_network
$ cfy secrets create router_name -s onap_router
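
The stored secrets can be verified at any time with the standard Cloudify CLI listing command:


$ cfy secrets list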

  2. Clone the ONAP OOM (ONAP Operations Manager) Casablanca branch on the Cloudify Manager:


$ git clone https://gerrit.onap.org/r/oom -b casablanca

  3. Update the software/hardware stack if needed.
    The existing Cloudify OpenStack blueprint creates a 7 node cluster. The number of nodes can be modified by updating the default_instances value:


$ vim oom/TOSCA/kubernetes-cluster-TOSCA/openstack-blueprint.yaml
------
properties:
  default_instances: 6

The software stack installed on the Kubernetes nodes can be modified by updating the packages section:

$ vim oom/TOSCA/kubernetes-cluster-TOSCA/imports/cloud-config.yaml
-------
packages:

- [docker-engine, 17.03.0.ce-1.el7.centos]
- [kubelet, 1.11.5-0]
- [kubeadm, 1.11.5-0]
- [kubectl, 1.11.5-0]
- [kubernetes-cni, 0.6.0-0]

Once all the required changes are done:

  1. Upload the blueprint to the Cloudify manager.
  2. Create a deployment with the blueprint uploaded.
  3. Execute install workflow on the deployment created.


$ cd oom/TOSCA/kubernetes-cluster-TOSCA
$ cfy blueprints upload -b onap_K8S openstack-blueprint.yaml
$ cfy deployments create -b onap_K8S onap_K8S_Dep
$ cfy exec start install -d onap_K8S_Dep
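
The install workflow takes some time to build the cluster; its progress can be followed with the Cloudify CLI (the execution ID comes from the first listing; check cfy events list --help for the exact syntax on your CLI version):


$ cfy executions list -d onap_K8S_Dep
$ cfy events list <execution_id>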

Upon successful installation, the Kubernetes cluster will be created. The details of this cluster can be retrieved using kubectl commands:

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
server-centos7-k8sd2-kubernetes-master-host.novalocal Ready master 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5
server-centos7-k8sd2-kubernetes-node-host.novalocal Ready <none> 40d v1.11.5

OpenStack environment variables (for example Tenant/Image/…) can be updated by logging in to the Cloudify UI, under System Resources / Secret Store Management / Update Secret.

ONAP Installation Pre-requisites:

Once the Kubernetes installation is complete, we are ready to start the ONAP installation.

If the Kubernetes cluster has been installed using the instructions above, the Helm installation and the NFS share setup can be skipped, as Cloudify sets up both of these. If an existing Kubernetes cluster is being used, follow all of the instructions.

Helm Installation on Kubernetes master:

Helm is used by ONAP OOM for package and configuration management.

Helm is the package manager for Kubernetes and helps in managing Kubernetes applications:


$ wget http://storage.googleapis.com/kubernetes-helm/helm-v2.9.1-linux-amd64.tar.gz
$ tar -zxvf helm-v2.9.1-linux-amd64.tar.gz
$ sudo mv linux-amd64/helm /usr/bin/helm

With the Helm client installed, execute the following commands to set up the Helm server (Tiller):

$ kubectl -n kube-system create sa tiller
$ kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
$ helm init --service-account tiller
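
Before continuing, it is worth confirming that the Tiller pod has started (a quick check, not part of the original steps):


$ kubectl get pods -n kube-system | grep tiller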

Verify the Helm installation

$ helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

Setting Up Shared NFS Storage Across Worker Nodes:

OOM currently requires that all kubelets mount a shared NFS directory for data storage, so that pods across the worker nodes can read and write configuration and data that persist across node reboots.

NFS on Kubernetes master Node


$ cat /etc/exports
/dockerdata-nfs *(rw,no_root_squash,no_subtree_check)
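
If the NFS server is not already running on the master, it can be set up with a few commands. This is a minimal sketch assuming CentOS 7 and the /dockerdata-nfs path used above; the wide-open permissions are for simplicity, so tighten them to suit your environment:


$ sudo yum install -y nfs-utils
$ sudo mkdir -p /dockerdata-nfs
$ sudo chmod 777 /dockerdata-nfs   # permissive so ONAP pods can write; adjust as needed
$ sudo systemctl enable nfs-server
$ sudo systemctl start nfs-server
$ sudo exportfs -a                 # publish the /etc/exports entry shown above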

On all worker nodes of the Kubernetes cluster, mount the above NFS directory by adding an entry to /etc/fstab.

Mounting NFS on worker Node


$ cat /etc/fstab
192.168.0.1:/dockerdata-nfs /dockerdata-nfs nfs auto 0 0
$ mount -a
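
For the mount above to succeed, the NFS client packages must be present on each worker. A minimal sketch, again assuming CentOS 7 and the server IP shown in the fstab entry:


$ sudo yum install -y nfs-utils
$ sudo mkdir -p /dockerdata-nfs
$ sudo mount -a
$ df -h /dockerdata-nfs   # verify the share is mounted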

Generate OpenStack Encrypted Password:

The SO chart expects the OpenStack password in encrypted form (the openStackEncryptedPasswordHere value used later). Generate it from the SO config directory, placing your OpenStack password between the quotes:

$ cd oom/kubernetes/so/resources/config/mso
$ echo -n "" | openssl aes-128-ecb -e -K `cat encryption.key` -nosalt | xxd -c 256 -p

ONAP Installation:

OOM is the preferred and recommended way to install ONAP. OOM is a set of Helm charts for Kubernetes which deploy and manage ONAP.

The ONAP Helm charts are hosted locally; execute the following commands to start the local Helm repository server and add the local repo:


$ helm init
$ helm serve &
$ helm repo add local http://127.0.0.1:8879
$ helm repo list

Clone the OOM code from the required branch:


git clone -b casablanca http://gerrit.onap.org/r/oom

ONAP has individual Helm charts for all of its components. There is also a parent chart called onap, which is used when changes across all components are needed. Update values.yaml under oom/kubernetes/onap; sample updated values are:


# image pull policy
pullPolicy: IfNotPresent

so:
  enabled: true
  replicaCount: 1
  liveness:
    # necessary to disable liveness probe when setting breakpoints
    # in debugger so K8s doesn't restart unresponsive container
    enabled: true
  # so server configuration
  config:
    # message router configuration
    dmaapTopic: "AUTO"
    # openstack configuration
    openStackUserName: "openstackadmin"
    openStackRegion: "RegionOne"
    openStackKeyStoneUrl: "https://cloud.openstack.com:5000/v3"
    openStackServiceTenantName: "onap"
    openStackEncryptedPasswordHere: "d05c5a37ab7af6dc52f3660bf20053a1"

Update oom/kubernetes/robot/values.yaml with the VIM (OpenStack) details; the robot scripts are used to run the demos (e.g. the vFW demo).

Execute the make command to prepare and save the Helm charts:


$ cd oom/kubernetes
$ make all
1 chart(s) linted, no failures
Successfully packaged chart and saved it

Listing Helm Repository


$ helm search -l
NAME CHART VERSION APP VERSION DESCRIPTION
local/aaf 3.0.0 ONAP Application Authorization Framework
local/aai 3.0.0 ONAP Active and Available Inventory
local/appc 3.0.0 Application Controller
local/cassandra 3.0.0 ONAP cassandra
local/clamp 3.0.0 ONAP Clamp
local/cli 3.0.0 ONAP Command Line Interface
local/common 3.0.0 Common templates for inclusion in other charts
local/consul 3.0.0 ONAP Consul Agent
local/contrib 3.0.0 ONAP optional tools
local/controller-blueprints 3.0.0 Controller Blueprints Micro Service
local/dcaegen2 3.0.0 ONAP DCAE Gen2
local/dgbuilder 3.0.0 D.G. Builder application
local/dmaap 3.0.0 ONAP DMaaP components
local/esr 3.0.0 ONAP External System Register
local/log 3.0.0 ONAP Logging ElasticStack
local/mariadb-galera 3.0.0 Chart for MariaDB Galera cluster
local/mongo 3.0.0 MongoDB Server
local/msb 3.0.0 ONAP MicroServices Bus
local/multicloud 3.0.0 ONAP multicloud broker
local/music 3.0.0 MUSIC - Multi-site State Coordination Service
local/mysql 3.0.0 MySQL Server
local/nbi 3.0.0 ONAP Northbound Interface
local/network-name-gen 3.0.0 Name Generation Micro Service
local/onap 3.0.0 Casablanca Open Network Automation Platform (ONAP)

The setup of the Helm repository is a one-time activity. In case changes are needed, update the values.yaml and re-run ‘make all’.

Once the make command succeeds, we are ready to deploy ONAP. The ONAP installation is performed with the master Helm chart onap, which installs all of the selected ONAP components.

Start the ONAP installation with the following command:


helm install local/onap --name onap --namespace onap
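
If only a subset of ONAP is required, individual component charts can also be toggled at install time with Helm's --set flag. The chart names below are examples taken from the repository listing above; the authoritative switches are the enabled flags in oom/kubernetes/onap/values.yaml:


$ helm install local/onap --name onap --namespace onap \
    --set clamp.enabled=false \
    --set cli.enabled=false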

The first-time installation will take a couple of hours, as a lot of images need to be downloaded from the Internet.

Checking and Verifying the Status of the Installation

Check the status of the ONAP installation with the kubectl get pods command and verify that all pods are in the Running or Completed state:


$ kubectl get pods -n onap
NAME READY STATUS RESTARTS AGE
dep-config-binding-service-7b68dfd444-6nwlk 2/2 Running 0 11d
dep-dcae-datafile-collector-b67b74598-b7qmx 2/2 Running 0 11d
dep-dcae-hv-ves-collector-6b4bf7f5db-z9s7f 2/2 Running 0 11d
dep-dcae-prh-78b579db5f-zbkf9 2/2 Running 0 11d
dep-dcae-snmptrap-collector-6455574cc4-9xf44 1/1 Running 0 11d
dep-dcae-tca-analytics-84f56d4cbc-db76r 2/2 Running 1 11d
dep-dcae-ves-collector-6c87c689cf-z7b5r 2/2 Running 0 11d
dep-deployment-handler-6644fc65b9-q69px 2/2 Running 0 11d
dep-inventory-9d66fbfd-s4kfv 1/1 Running 0 11d
dep-policy-handler-8944dd474-kncpg 2/2 Running 0 11d
dep-pstg-write-77c89cb8c4-mgd62 1/1 Running 0 11d
dep-service-change-handler-7b544f558d-gktk9 1/1 Running 0 11d
onap-aaf-cm-9545c9f77-v2nsq 1/1 Running 0 11d
onap-aaf-cs-84cbf5d4ff-x86mz 1/1 Running 0 11d
onap-aaf-fs-65ccb9db74-5cpzm 1/1 Running 0 11d
onap-aaf-gui-7c696c4cb6-lfkv4 1/1 Running 0 11d
onap-aaf-hello-747fbc7bc7-g98cs 1/1 Running 0 11d
onap-aaf-locate-788d8d7f6d-tmk2v 1/1 Running 0 11d
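
Any pods that are not yet healthy can be listed by filtering out the healthy states (a simple convenience check, not part of the original output):


$ kubectl get pods -n onap | grep -v -E 'Running|Completed'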

Executing Basic Robot tests

Execute the oom/kubernetes/robot/ete-k8s.sh script to test the basic functionality of the ONAP deployment:

[centos@k8s-master1 robot]$ ./ete-k8s.sh onap health
Executing robot tests at log level TRACE
==============================================================================
Testsuites
==============================================================================
Testsuites.Health-Check :: Testing ecomp components are available via calls.
==============================================================================
Basic A&AI Health Check | PASS |
------------------------------------------------------------------------------
Basic AAF Health Check | PASS |
------------------------------------------------------------------------------
Basic AAF SMS Health Check | PASS |
------------------------------------------------------------------------------
Basic APPC Health Check | PASS |
------------------------------------------------------------------------------
Basic CLI Health Check | PASS |
------------------------------------------------------------------------------
Basic CLAMP Health Check | PASS |
------------------------------------------------------------------------------
Basic DCAE Health Check | PASS |
------------------------------------------------------------------------------
Basic DMAAP Data Router Health Check | PASS |
------------------------------------------------------------------------------

Accessing the ONAP portal

From the machine where the portal needs to be accessed, the /etc/hosts file needs to be updated with the component names and the host IPs where the corresponding pods are running.

A sample /etc/hosts is as follows:


192.168.127.151 portal.api.simpledemo.onap.org
192.168.127.153 vid.api.simpledemo.onap.org
192.168.127.154 sdc.api.fe.simpledemo.onap.org
192.168.127.154 portal-sdk.simpledemo.onap.org
192.168.127.151 policy.api.simpledemo.onap.org
192.168.127.151 aai.api.sparky.simpledemo.onap.org
192.168.127.153 cli.api.simpledemo.onap.org
192.168.127.153 msb.api.discovery.simpledemo.onap.org
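
The node IPs and exposed NodePorts to use in /etc/hosts can be looked up from the cluster itself (a quick check; exact pod and service names may differ slightly between releases):


$ kubectl get pods -n onap -o wide | grep portal
$ kubectl get svc -n onap | grep -i portal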

Access the ONAP portal with: https://portal.api.simpledemo.onap.org:30225/ONAPPORTAL/login.htm using the user credentials:

Username: demo

Password: demo123456!

Aptira ONAP Open Network Automation Platform Portal

That’s it! You’ve now installed ONAP on Kubernetes using Cloudify. If you get stuck with this, check out our training courses to learn more about technologies including Kubernetes, as well as a range of Open Networking techniques. Alternatively, contact our Solutionauts for help.

The post How to Install ONAP on Kubernetes using Cloudify appeared first on Aptira.

by Aptira at June 19, 2019 08:21 AM

June 18, 2019

OpenStack Superuser

Check out these open infrastructure project updates

If you’re interested in getting up to speed on what’s next for open infrastructure software, the project update videos from the Open Infra Summit Denver are available now.

In them you’ll hear from the project team leaders (PTLs) and core contributors about what they’ve accomplished, where they’re heading for future releases plus how you can get involved and influence the roadmap.

You can find the complete list of them on the video page. You can also get a complete overview of the projects (and how to get involved) on the OpenStack project navigator. For more on projects independent from the OSF (Airship, Kata Containers and Zuul) follow the links to those sites.

Some project updates that you won’t want to miss, in alphabetical order:

Airship

Airship is a collection of loosely coupled, interoperable open-source tools that are nautically themed.

Ironic

Bare metal provisioning service implements services and associated libraries to provide massively scalable, on demand, self-service access to compute resources, including bare metal, virtual machines and containers.

Kata Containers

Kata Containers bridges the gap between traditional VM security and the lightweight benefits of traditional Linux* containers.

Heat

Heat orchestrates the infrastructure resources for a cloud application based on templates in the form of text files that can be treated like code. Heat provides both an OpenStack-native ReST API and a CloudFormation-compatible Query API.

Mistral

Mistral is the OpenStack workflow service. It aims to provide a mechanism to define tasks and workflows without writing code, manage and execute them in the cloud environment.

Octavia

Octavia is an open source, operator-scale load balancing solution designed to work with OpenStack.

Swift

Swift is a highly available, distributed, eventually consistent object/blob store. Organizations can use Swift to store lots of data efficiently, safely, and cheaply. It’s built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound.

Zuul

Zuul drives continuous integration, delivery and deployment systems with a focus on project gating and interrelated projects.

The post Check out these open infrastructure project updates appeared first on Superuser.

by Superuser at June 18, 2019 02:05 PM

Aptira

DevOps Training: Tools and Practices

Aptira DevOps Development-as-a-Service Globe Follow The Sun

DevOps: Make Your Job Easier

We live in a world of software-defined everything. Our enterprise customers know that innovative applications of software are critical for transforming and growing businesses. In order to support the innovation process, DevOps processes can help your business shift application products to market faster and easier.

How can DevOps make your job easier?

  • Custom Development: Creation of customised tools, designed specifically to suit your unique requirements.
  • Simplified Solutions: Reduce the complexity of your systems, in turn reducing reliance on resources and overall cost.
  • Integration: Simple integration with your existing systems, reducing the learning curve for new systems.
  • Automation: Fast, repeatable deployments, minimised error rates and increased flexibility.

Custom development is one of our pillars of competence, and we have completed projects for several leading storage vendors, government funded initiatives and education/research consortia. Our team can develop a solution that will drive innovation or we can work alongside you to provide mentoring and lead your team to achieve your development goals.

But if you’d rather do it yourself – we can help get you on your way.

DevOps Training

Using our experience from recent projects, our Solutionauts have put together a 2 day DevOps course. This course is designed to introduce participants to the concepts, tools and practices of DevOps, Version Control and Automation.

This course is ideal for System administrators and DevOps professionals who want to understand and use DevOps tools in a practical environment. Ideally for those who haven’t had much experience with the toolsets or paradigms – not seasoned DevOps pros. There will be plenty of opportunities to get hands on experience, with almost half of the course provided as hands-on labs. For more information, check out the course outline here. 

As with everything we do, our courses are fully customised to suit your requirements. So if you’re after some more advanced DevOps techniques – check out our comprehensive list of DevOps topics. Or chat with us so we can create a customised course to meet your specific learning objectives. Our DevOps topics include:

We’re also offering end of financial year discounts on all of our technology training – including DevOps. This discount applies to pre-paid training, booking multiple courses, bundling with your hardware, software licences and any of our services (including DevOps and Custom Development). So if you’re looking to upgrade your infrastructure, or learn how to manage it more efficiently – now is the time.

This deal is running until the end of June, but can be used at any time during the next 12 months. For more information on our DevOps services, or to get the best discount for you – chat with our Solutionauts today.

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post DevOps Training: Tools and Practices appeared first on Aptira.

by Jessica Field at June 18, 2019 01:53 PM

June 17, 2019

OpenStack Superuser

Running Relax-and-Recover to save your OpenStack deployment

Relax-and-Recover (ReaR) is a pretty impressive disaster recovery solution for Linux. ReaR creates both a bootable rescue image and a backup of the associated files you choose.

When doing disaster recovery of a system, this rescue image restores the files from the backup, bringing the system back to its latest state in the twinkling of an eye.

Various configuration options are available for the rescue image. For example, slim ISO files, USB sticks or even images for PXE servers can be generated. Just as many backup options are possible, starting with a simple archive file (e.g. *.tar.gz); various backup technologies such as IBM Tivoli Storage Manager (TSM), EMC NetWorker (Legato), Bacula or even Bareos can be addressed.

ReaR, written in Bash, enables the convenient distribution of the rescue image and, if necessary, the archive file via NFS, CIFS (SMB) or another transport method on the network. The actual recovery process then takes place via this transport route. In this specific case, due to the nature of the OpenStack deployment, we will choose the protocols that are allowed by default in the iptables rules (SSH and SFTP in particular).

But enough with the theory, here’s a practical example of one of many possible configurations. We’ll apply this specific use of ReaR to recover a failed control plane after a critical maintenance task (like an upgrade).

Prepare the undercloud backup bucket

We need to prepare the place to store the backups from the overcloud. From the undercloud, check that you have enough space to make the backups and prepare the environment. We’ll also create a user in the undercloud with no shell access to be able to push the backups from the controllers or the compute nodes.

groupadd backup
mkdir /data
useradd -m -g backup -d /data/backup backup
echo "backup:backup" | chpasswd
chown -R backup:backup /data
chmod -R 755 /data

Run the backup from the overcloud nodes

#Install packages
sudo yum install rear genisoimage syslinux lftp -y

#Configure ReaR
sudo tee -a "/etc/rear/local.conf" > /dev/null <<'EOF'
OUTPUT=ISO
OUTPUT_URL=sftp://backup:backup@undercloud-0/data/backup/
BACKUP=NETFS
BACKUP_URL=sshfs://backup@undercloud-0/data/backup/
BACKUP_PROG_COMPRESS_OPTIONS=( --gzip )
BACKUP_PROG_COMPRESS_SUFFIX=".gz"
BACKUP_PROG_EXCLUDE=( '/tmp/*' '/data/*' )
EOF

Now run the backup; this should create an ISO image on the undercloud node (/data/backup/).

sudo rear -d -v mkbackup
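
Back on the undercloud, the rescue ISO and the backup archive pushed over SFTP should now be visible:

ls -lh /data/backup/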

Now, simulate a failure xD

# sudo rm -rf /

After the ISO image is created, we can proceed to verify we can restore it from the hypervisor.

Prepare the hypervisor

# Install some required packages
# Enable the use of fusefs for the VMs on the hypervisor
setsebool -P virt_use_fusefs 1

sudo yum install -y fuse-sshfs

# Mount the Undercloud backup folder to access the images
mkdir -p /data/backup
sudo sshfs -o allow_other root@undercloud-0:/data/backup /data/backup
ls /data/backup/*

Stop the damaged controller node

virsh shutdown controller-0

# Wait until is down
watch virsh list --all

# Backup the guest definition
virsh dumpxml controller-0 > controller-0.xml
cp controller-0.xml controller-0.xml.bak

Now, we need to change the guest definition to boot from the ISO file.

Edit controller-0.xml and update it to boot from the ISO file.

Find the OS section, add the cdrom device and enable the boot menu.

<os>
<boot dev='cdrom'/>
<boot dev='hd'/>
<bootmenu enable='yes'/>
</os>

Edit the devices section and add the CDROM.

<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/data/backup/rear-controller-0.iso'/>
<target dev='hdc' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>

Update the guest definition.

virsh define controller-0.xml

Restart and connect to the guest

virsh reset controller-0
virsh console controller-0

You should be able to see the boot menu to start the recover process. Select “recover controller-0” and follow the instructions.


You should see a message like:

Welcome to Relax-and-Recover. Run "rear recover" to restore your system !

RESCUE controller-0:~ # rear recover

The image restore should progress quickly.

Continue to see the restore evolution.

Now, each time you reboot the node will have the ISO file as the first boot option so it’s something we need to fix. In the meantime, let’s check to see if the restore worked.

Reboot the guest booting from the hard disk.

Now we can see that the guest VM started successfully.

Now we need to restore the guest to its original definition, so from the hypervisor we restore it using the controller-0.xml.bak file we created earlier.

#From the Hypervisor
virsh shutdown controller-0
watch virsh list --all
virsh define controller-0.xml.bak
virsh start controller-0

Enjoy.

Considerations

  • Space
  • Multiple protocols are supported (but we might then need to update firewall rules, which is why I preferred SFTP)
  • Network load when moving data
  • Shutdown/Starting sequence for HA control plane
  • Whether the data plane requires backup
  • User workloads should be handled by a third party backup software

Got feedback?

Visit this post’s issue page on GitHub.

Carlos Camacho is a software engineer from Madrid, Spain who works at Red Hat. This post first appeared on his blog.

 

Superuser is always interested in community content, get in touch: editorATopenstack.org

The post Running Relax-and-Recover to save your OpenStack deployment appeared first on Superuser.

by Carlos Camacho at June 17, 2019 03:54 PM

Aptira

Evaluating Red Hat and Mirantis OpenStack for Nokia Virtual Network Functions (VNF)

Aptira Virtual Network Functions (VNF)

One of Australia’s leading service providers is moving their network operations to OpenStack, and would like us to evaluate Telco workloads on their internal cloud systems – specifically Nokia Virtual Network Functions (VNFs).


The Challenge

The client had purchased hardware from EMC but did not have the expertise required to deploy and manage a private Cloud system. Therefore, Aptira were contracted to set up an on-premises OpenStack and help them install an orchestrator to run virtual network services on it.


The Aptira Solution

Aptira designed and deployed Mirantis OpenStack on their hardware using Fuel and integrated the OpenStack instance with VMware. This configuration enabled OpenStack to launch virtual machines on either a KVM or ESXi hypervisor. After OpenStack had been set up and was operating successfully, the Nokia VNFs were deployed on OpenStack for testing.

The client also wanted Aptira to repeat the same process on Red Hat’s OSP, although this instance did not require integration with VMware.


The Result

We were able to successfully assist the customer with evaluating both Mirantis and Red Hat versions of OpenStack, as well as the feasibility of running Nokia Network Function Virtualisations on OpenStack.

Our staff provided several demonstration sessions to their internal team to show how the system works and how the components interact with each other. This will enable them to successfully operate and manage the system going forward. We also provided in-depth OpenStack training to increase their OpenStack knowledge, ensuring the ongoing success of this platform. You can find more info on this specific training outcome here.

Finally, we documented everything in this assignment and provided the report to their architects for reference in future architectural design.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Evaluating Red Hat and Mirantis OpenStack for Nokia Virtual Network Functions (VNF) appeared first on Aptira.

by Aptira at June 17, 2019 01:27 PM

Cloudwatt

Tenant description in the Cloudwatt dashboard

If you have multiple tenants, it might sometimes be difficult to know which project a tenant belongs to in the tenant selection list.

It is now possible to display the tenant description instead of its name in the Cloudwatt Dashboard ( https://console.cloudwatt.com ), if a description is available.

To do so, from the Cloudwatt Dashboard click on your username then on “Settings”:

settings

or go directly here: https://console.cloudwatt.com/settings/

Then click on the “Display the Tenant description (if available) instead of name in the Tenant selector” checkbox and click on “Save”:

checkbox setting

If you have a tenant description recorded it will be displayed instead of the tenant name in the tenant selector list.

To modify this description or to create one, go to the tenant selection in the Cloudwatt Cockpit here: https://portail.cloudwatt.com/cockpit/#/tenants

Then click on the desired tenant and on the “Tenant information tab” and on the icon beside the tenant description:

cockpit

Update the description and then click on “UPDATE DESCRIPTION”:

cockpit

In the Cloudwatt dashboard you will see the description instead of the name.

tenant selector vue

by Yves-Gwenael at June 17, 2019 12:00 AM

June 14, 2019

Chris Dent

More on Maintainership

This is a followup to some of the thoughts about being a "maintainer" raised in OpenStack Denver Summit Reflection.

Of the work that I've done in OpenStack, what has felt most relevant to me is the work that has not been related to feature development. In the context of being a "professional maintainer" raised in the post linked above, this makes a lot of sense.

The work that has felt relevant has improved or optimized either the existing aspects of the systems themselves, or the process of creating those systems. For example, Gabbi was created because it was too difficult to test and understand API changes in the Telemetry project. Adding support for environment variables in oslo.config was driven by the placement container playground series which was essentially an exercise in making sure Placement was easy to test and painless to use in experiments.

The playing around that drove that work has exposed (and fixed) more bugs and performance issues in Placement, in advance of common use, than real world use.

More than a year ago I wrote a draft of how I wanted the culture of the Placement project to be different, once it was extracted from Nova. Extraction happened near the start of this year. It's interesting to look back at that draft now.

In it, I discussed why the historical culture of Nova sometimes seemed "mean in at least three senses of the word: unkind, miserly, and poor in quality":

There are many factors that have led to this, but one of them is the way in which the responsibilities of core reviewers have evolved to defend stability and limit change while simultaneously being responsible for creating much of the design and code.

If placement, once extracted, wants to have a different culture it needs to shrug off this history and actively work to do things differently. While becoming a core reviewer will probably be partly the result of creating (and reviewing) features, working on features should be low down on the priority list once someone is a core. Priorities should be:

  1. Review changes (from others, primarily users)
  2. Fix existing code and tech debt
  3. Refactor and refresh code and documentation

Being oriented towards improving existing code rather than creating features has a few effects:

  • It leaves a clear and inviting doorway for others to participate.
  • It codifies regular refactoring.
  • It provides clear safety for incoming new changes to be accepted based on being "good enough" because there's awareness that fixing existing code is part of the process.

That sounds a great deal like what I wrote last week, despite having completely forgotten about the draft from last year: "It probably means writing less code for features and more code (and docs) to enable other people to write features."

For that to be possible, a maintenance-oriented person needs significant amounts of time available for reflection and experimentation. Headspace in which to be able to think. In my too long experience as a software professional, very few software development and/or support teams have recognized the value of headspace, requiring those who want to reflect and experiment to do it outside their normal eight hours or not at all. Hours which are often over-packed with adherence to bastardized and misunderstood versions of agility, despite clear evidence of the value that comes from space to think and discover.

by Chris Dent at June 14, 2019 02:15 PM

Aptira

Ceph Training + Software Licence Discounts

Aptira Ceph Training and Software Licenses

Ceph Storage Solutions

We love Ceph. We’ve used it for several projects recently, including to build a very high-performing and scalable storage landscape for the Swinburne University of Technology, as well as a custom internal lab to help our Solutionauts work more efficiently. Ceph has become a core component in helping us to build cost efficient, scalable and storage software solutions.

Why do we love Ceph so much?

  • Increased control over data distribution and replication strategies
  • Consolidation of object storage and block storage
  • Fast provisioning of boot-from-volume instances using thin provisioning
  • CephFS Support
  • We love Open Source 🙂

For enterprise-level businesses, the demand for additional data storage is growing too fast for traditional storage options to continue to be an affordable solution. Continuing down this path means you’ll be forced to increase your budget dramatically in order to keep up with your data needs.

However, there is another answer—SUSE Enterprise Storage. This intelligent software defined storage solution, powered by Ceph technology, enables you to transform your enterprise storage infrastructure and reduce costs while providing unlimited scalability. The result is an affordable and easy-to-manage enterprise storage solution. We teamed up with SUSE to provide seamless storage solutions. Aptira has been a partner since 2016, – and is SUSE’s first solution partner in APJ – sharing our expertise across a range of different Open Source projects.

You should consider Ceph if you want to manage your object and block storage within a single system, or if you want to support fast boot-from-volume.

Ceph Training

If you’d like to learn more about Ceph, we’re offering a 2 day Ceph training course, which covers all the core features of Ceph Storage Essentials, including:

  • Ceph Node Types
  • Ceph Architecture
  • Cluster Maps
  • Object Placement
  • Background Producers
  • Installation
  • Customising Ceph
  • and more.

We’re also offering end of financial year discounts on all of our technology training – including Ceph. This discount applies to pre-paid training, booking multiple courses, bundling with your hardware, software licences (such as SUSE) and any of our services. So if you’re looking to upgrade your storage infrastructure, or learn how to manage it more efficiently – now is the time.

This deal is running until the end of June, but can be used at any time during the next 12 months. For more information on our Ceph Storage Solutions, or to get the best discount for you – chat with our Solutionauts today.

Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Ceph Training + Software Licence Discounts appeared first on Aptira.

by Jessica Field at June 14, 2019 01:32 PM

Chris Dent

Placement Update 19-23

19-23. I'll be travelling the end of next week so there will be no 19-24.

Most Important

We keep having interesting and helpful, but not yet fully conclusive, discussions related to the spec for nested magic. The discussion on the spec links back to a few different IRC discussions. Part of the reason this drags on so much is that we're trying to find a model that is internally consistent in placement and generally applicable while satisfying the NUMA requirements from Nova, while not requiring either Placement or Nova to bend over backwards to get things right, and while not imposing additional complexity on simple requests.

(And probably a few other whiles...)

If you have thoughts on these sorts of things, join that review. In the meantime there are plenty of other things to review (below).

What's Changed

  • A blocker migration for incomplete consumers has been added, and the inline migrations that would guard against the incompleteness have been removed.

  • CORS configuration and use in Placement has been modernized and corrected. You can now, if you want, talk to Placement from JavaScript in your browser. (This was a bug I found while working on a visualisation toy with a friend.)

  • Result sets for certain nested provider requests could return different results in Python versions 2 and 3. This has been fixed.

  • That ^ work was the result of working on implementing mappings in allocations which has merged today as microversion 1.34

Specs/Features

  • https://review.opendev.org/654799 Support Consumer Types. This has some open questions that need to be addressed, but we're still go on the general idea.

  • https://review.opendev.org/662191 Spec for nested magic 1. The easier parts of nested magic: same_subtree, resourceless request groups, verbose suffixes (already merged as 1.33). See "Most Important" above.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 20 (0) stories in the placement group. 0 (0) are untagged. 3 (0) are bugs. 6 (2) are cleanups. 11 (0) are rfes. 2 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

1832814: Placement API appears to have issues when compute host replaced is an interesting bug. In a switch from RDO to OSA, resource providers are being duplicated because of a change in node name.

osc-placement

osc-placement is currently behind by 11 microversions.

There are no changes that have had attention in less than 4 weeks. There are 4 other changes.

Main Themes

Nested Magic

The overview of the features encapsulated by the term "nested magic" are in a story and spec.

There is some in progress code, mostly WIPs to expose issues and think about how things ought to work:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

We continue to do cleanup work to lay in reasonable foundations for the nested work above. As a nice bonus, we keep eking out additional performance gains too. There are two new stories about some minor performance degradations:

Gibi discovered that osprofiler wasn't working with placement and then fixed it:

Thanks to Ed Leafe for his report on the state of graph database work.

Other Placement

Miscellaneous changes can be found in the usual place.

There are three os-traits changes being discussed. And one os-resource-classes change.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

If you, like me, use this collection of information to drive what to do early in the next week, you might like placement-reviewme which brute force loads all the links from the HTML version of this in tabs in your browser.

by Chris Dent at June 14, 2019 01:03 PM

June 13, 2019

OpenStack Superuser

Takeaways from the truly awesome OpenStack Days CERN

GENEVA — Awesome! The word is often overused on things that don’t really deserve it, but OpenStack Day CERN was the literal definition of awesome.

The event was packed with the most inspiring modern science being performed today. We heard from CERN, the Square Kilometer Array (SKA), NASA and the Wellcome Sanger Institute among others. Most of the talks were given by various IT folks at these organizations. While they may not be scientists themselves, hearing them talk about the part they play in the furthering of our communal knowledge of the world around us and how we exist was one of the most engaging OpenStack/open infra events I’ve ever attended.

awesome

/ˈɔːs(ə)m/

: adjective;

extremely impressive or daunting; inspiring awe.

There were over 200 people at the event from a variety of backgrounds. Some were longtime OpenStack users, others new to OpenStack, some returning to OpenStack, others in the OpenStack upstream community interested in its application to science, and others that were just lovers of CERN. There was something for everyone at the event.

Kendall Nelson and OSF co-worker Ildiko Vancsa at the LHC.

After OSF executive director Jonathan Bryce opened with a keynote about collaborating without borders, Ian Bird and Tim Bell from CERN’s IT department were up next. They offered an inside look into how they handle huge streams of data coming in from various experiments (ATLAS, ALICE, etc.) — around 90 petabytes per year! — and pinpoint the notable parts of the data quickly so that they can write that to tape (yes, it’s 2019, and they’re writing to tape.)

As the experiments progress over the coming years, data streaming in will only increase exponentially so they need to find new ways to handle, process and pay for that growth. They weren’t the only ones facing this issue though, SKA has similar concerns. Both organizations are placing their faith in OpenStack.

NASA and the Wellcome Sanger Institute also gave talks on how their users are making use of OpenStack in their data centers. Their science covers everything from counting trees in the Sahara and analyzing weather data, to genotyping 25 different species of plants and animals native to Great Britain. (This last research project celebrates the 25th anniversary of the Wellcome Sanger Institute.)

I sat there enthralled–for hours–listening to these speakers tell stories of their users and how OpenStack is accelerating science, hitting on the theme of the event. I was constantly in awe of the brilliance of the scientists making use of OpenStack and incredibly proud to be a part of helping their research and efforts move forward.

It struck me that code I’ve written is being used at CERN. Someday, releases I have helped manage will be deployed at NASA!

The event itself (and the tours the day after) are part of a bucket list I didn’t even know I had.

Jealous? Open Infra folks should stay tuned to Superuser for videos from the event when they get posted later this month.

There’s also the upcoming Ceph day at CERN. If you find yourself in Geneva for the Ceph day or plan to visit Europe in mid-September, CERN will be open to the general public on September 14-15.

You should definitely check it out!

Photo // CC BY NC

The post Takeaways from the truly awesome OpenStack Days CERN appeared first on Superuser.

by Kendall Nelson at June 13, 2019 02:05 PM

Aptira

Open Network Integration. Part 6 – Systems Integration

Aptira Open Networking: Agile Systems

In the last post we described DevOps and how it drives highly performant and reliable systems with openness and flexibility. In this post we complete the Open Network Integration domain by describing the Systems Integration practice.

What is Systems Integration

A classic definition of Systems Integration (SI) is:

the process of bringing together the component sub-systems into one system … and ensuring that the subsystems function together as a system

Wikipediahttps://en.wikipedia.org/wiki/System_integration

Systems Integration (SI) covers the entire end-to-end solution implementation process across all components and processes. SI is about integrating multiple sub-systems and components into a cohesive whole that delivers the required overarching capabilities. The components are not only technological but also business and operational.

SI is almost always about integrating one or more vendor’s products with existing and potentially new internal systems. Given that vendors often operate in different geographies, this presents the problem of integrating multiple cultures and potentially multiple languages.

In a world in which enterprise business processes are increasingly outsourced to specialist providers, even dealing with entities within a single enterprise can involve transitioning multiple external service providers.

In summary: SI is about “multi-everything” – multi-technology, multi-component, multi-organisation, multi-process, multi-culture, etc.

We have Agile and DevOps. Why do we need Systems Integration too?

The typical Agile project is a single outcome software development project. As project scope grows, e.g. multiple vendors involved across multiple organisations, the limits to this approach become clear.

Although there are many branded methodologies that promote “agile at scale” (e.g. SaFE), they work best when there is a degree of process homogeneity across the multiple project stakeholders. This usually requires a major transformation before the project gets started.

At the scale at which Systems Integration becomes useful, we don’t see homogeneous practices across the project scope. Typically we see a heterogeneous mix of lifecycles and implementation approaches that have to be integrated together into a working project.

There are a number of factors that put significant stress on Agile practices and that require additional tools and practices, including:

  • Existence of formal, complex and often stringent contracts between partners
  • Much larger team sizes spread across different entities and often, geographic regions
  • Need to manage significant decisions across internal and external organisational boundaries
  • Multiple systems (internal and external to the organisation)
  • Need for major organisational change management, which brings with it conflict and political power struggles in different departments, divisions, and regions
  • Impracticality of a single Product Owner: we need to deal with multiple, even a large number of, “go to” people from both a knowledge and decision-making perspective across various organisational entities
  • More stringent and complex compliance / governance processes, due to larger size and risk, which can lead to:
    • Requirement for more detailed upfront specifications and estimates than most agile projects
    • Up-front budget & timeline estimates that are typically locked into a fairly narrow range (+/- 10% is not uncommon, depending on the organisation)

In most Open Networking projects, it is not practical to run the necessary transformations that would enable fully agile approaches across organisations. We just have to play the ball where we find it.

How do we address these factors in Open Networking Projects?

Does this mean that for Open Networking projects we need revert to full “Waterfall” as our lifecycle model? The answer is no, not completely; but we have to recognise the reality of these factors and put in place mechanisms that allow us to manage them.

These mechanisms include:

  • Stakeholder Management: using both formal and informal mechanisms
    • Identify all the decision-makers and influencers and how to engage them
    • Management of multiple “Product Owners”, actual or proxy, for requirements inputs and prioritisation
    • Successful operation of cross-organisational escalation and conflict resolution processes
    • Understanding and participation in formal governance processes
  • Full-scope dependency management: between multiple points and multiple types (e.g. technical, process, governance)
  • Precise but flexible models: of requirements, solution structure (components and interfaces), resource estimates (time, cost and people). These need to be precise, contextually rigorous, easily understood and flexible
  • Appropriately Rapid Decision-making: decision timeframes need to be shorter than change cycles

Conclusion

Overall, SI means being immensely practical and solving specifically for each project’s unique and actual circumstances. This results in a highly tailored approach for each project: Systems Integration is not a donut-making machine. The implication of this is that we need the skills to set this custom implementation plan up and obtain agreement from all the different stakeholders.

We need to be as flexible and agile as possible within the constraints imposed. Above all we need to carefully manage stakeholders to avoid the “culture wars” between agile and plan-based approaches or between different organisational perspectives; there is enough conflict in large projects as it is, without inventing new sources. 

That brings us to the conclusion of our description of the Open Network Integration domain. Now we have covered all three domains, plus the Software Interlude. What’s next? 

Next up we begin a series on Interoperability in the Real World, in which we examine the goal of Open Networking and how well (or not) that goal can be achieved. 

Stay tuned. 

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post Open Network Integration. Part 6 – Systems Integration appeared first on Aptira.

by Adam Russell at June 13, 2019 01:06 PM

June 12, 2019

OpenStack Superuser

What’s next for OpenStack Keystone

At the Open Infrastructure Summit in Denver, project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, what’s new for this release and what to expect for the next one, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.

What

Keystone is an OpenStack project that provides identity, token, catalog and policy services. It’s a shared service for authentication and authorization broker between OpenStack and other identity services.

Who

Current PTL Colleen Murphy, who works at SUSE as a cloud developer and Lance Bragstad, Huawei, former PTL.

What’s new

They started by sharing some metrics from the Rocky to Stein release.

“We noticed a pretty significant uptick in the number of commits — 73 percent — these are patches that were proposed, reviewed and landed during the Stein development cycle to any Keystone-related project or repository,” Bragstad says.

The team also noticed a slight uptick in the number of people who landed a patch. More commits equals more reviews, “which is why you’re seeing a 42 percent increase from Rocky,” Bragstad adds. “We did notice our core team was reduced by a third,” Bragstad says. There was also a 60 percent increase in the number of bugs opened against identity-related projects — but community members also managed to double the number of bugs squashed.

The pair outlined what the community delivered in the Stein release:

  • MFA Receipts
  • JWS tokens
  • Domain-level quota limits
  • System scope APIs
  • Read-only role

What’s next

There’s an impressive amount of work expected to deliver with the upcoming release, Train:

  • Access rules for application credentials
  • Renewable application credentials
  • Client support for MFA receipts
  • System scope policy changes were completed
  • Read-only role implementation polished
  • Immutable resources

The team also already has in sight features and improvements for upcoming releases, including:

  • Federation and edge improvements
  • Identity provider proxy
  • Hierarchical enforcement models for unified limits
  • Enhance tokenless authentication

Cross-project initiatives include adoption of unified limits, properly consuming scope types and default roles support.

Get involved

Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the mailing list openstack-discuss at lists.openstack.org and use the tag [keystone]
Participate in the weekly meeting, Tuesdays at 1600 UTC in #openstack-meeting-alt

Catch the whole 15-minute session below.

Photo // CC BY NC

The post What’s next for OpenStack Keystone appeared first on Superuser.

by Superuser at June 12, 2019 02:04 PM

Aptira

Open Network Integration. Part 5 – DevOps

Aptira Open Networking: DevOps

In the last post, we completed our description of Agile methods, and how they enable Open Networking solutions. In this post we move on to the second practice: DevOps.

The term “DevOps” evolved through multiple definitions and perspectives but in essence is an outgrowth of Google’s Site Reliability Engineering (SRE).

Google’s Site Reliability Engineering

SRE began in around 2003 when Google started hiring software engineers to run production environments. In the words of its inventor, Ben Treynor, SRE is:

fundamentally doing work that has historically been done by an operations team, but using engineers with software expertise, and banking on the fact that these engineers are inherently both predisposed to, and have the ability to, substitute automation for human labor.

Ben Treynorhttps://landing.google.com/sre/interview/ben-treynor/

For SRE as much as DevOps, automation is an aspect, but it’s not the objective.

DevOps isn’t about automation, just as astronomy isn’t about telescopes.

Christopher Littlea technology executive and one of the earliest chroniclers of DevOps

DevOps

SRE and DevOps share some foundational principles. DevOps as a term began to emerge in around 2008 or 2009, although the exact genesis is historically vague.

DevOps is a combination of software development (Dev) and information technology operations (Ops). DevOps is a set of software development practices that aim to shorten the systems development life cycle while delivering features, fixes, and updates frequently in close alignment with business objectives.

Wikipediahttps://en.wikipedia.org/wiki/DevOps

In order to improve reliability and security and provide faster development and deployment cycles, DevOps targets process areas such as: 

  • Product delivery
  • Monitoring and Measurement
  • Continuous testing
  • Quality testing
  • Feature development
  • Maintenance releases

More recent definitions of DevOps have become broader still, with DevOps taking on an end-to-end scope, i.e. the delivery, development, and management of applications throughout the systems development life cycle, and thus it must employ a wide range of tools.

DevOps has adopted ideas from Agile, Lean, Organisation Theory, ITSM and ITIL amongst others, so there are many conceptual and practical overlaps between DevOps and Agile at a technical, tool and process level.

Tools and Toolsets

A DevOps toolset (or toolchain) is a set of tools that support one or more of the following activities: Plan, Create, Verify, Package, Release, Configure, and Monitor.

Three foundation practices for DevOps are:

  • Continuous Integration (CI): requires developers to integrate code into a shared repository frequently and verified by an automated build
  • Continuous Delivery (CDE): ensures that a team’s code is always in a deployable state, even if the code is not actually deployed to production
  • Continuous Deployment (CD): any software release that has passed upstream validation (typically CI/CDE as above) is automatically released into the production environment

These three processes underpin the end-to-end scope of DevOps. They are usually referred to as “CI/CD”. Understanding local context will determine whether the “D” is “Delivery” or “Deployment”.

DevOps Evolution

DevOps continues to evolve, as tools and practices evolve. DevOps practices are including AI and Machine Learning into their toolsets to provide more intelligent decision-making based on the data that is being collected.

For example, a common alarm setup for storage in production systems is to raise alarms at rising levels of criticality as free space declines, e.g. at 50%, 75% and 90% used. At the point the most critical alarm is raised (say 95% full), actions are triggered to conserve remaining space and/or free up space and/or order more disk space. But all of these actions take time.

A DevOps approach would be to monitor usage and constantly update the forecast time to reach 95% full, based on the time the responses take, e.g. to order more disk and have it installed. It would then trigger the most critical alarm as soon as the forecast time to remediate was equal to or less than the forecast time to reach 95% full.

Conclusion

Whilst not discounting the importance of influencing and improving the overall organisational and practice context within which any given project is embedded, the Open Networking Integration view of DevOps focuses primarily on the end-to-end automation of lifecycle operations. Where possible within the context of any given project, Open Networking will adopt as many of these ideas as makes practical sense, and is ready to integrate with broader DevOps practices where they may be found.

Aptira’s service delivery approach is based on the adage that “if you give someone a fish, you feed them for a meal, but if you teach someone to fish, then you feed them for a lifetime”.

Aptira’s stated preference is to teach our customers to fish.

Of necessity this can expose us to areas that are not technically part of our brief, e.g. organisational structures, human resources policies (hiring and training), business and technical processes, and even more general management areas in some cases.

However, Aptira believes in addressing the “whole patient” rather than just addressing a narrow technical specialty. We prefer to provide holistic and complementary inputs to those already existing in the customer’s organisation.

How does Aptira balance the need to deliver on specific technical requirements and at the same time be aware of and ready to help customers on a broader basis?

Through the practice of Systems Integration, which we’ll cover in the next post.

Stay tuned.

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post Open Network Integration. Part 5 – DevOps appeared first on Aptira.

by Adam Russell at June 12, 2019 01:54 PM

June 11, 2019

Fleio Blog

Fleio 2019.06: two-factor authentication, volume backup and more

Fleio version 2019.06 is now available! The latest version enables you to secure the end-user and staff user accounts with two-factor authentication, clients can create volume backups and some new customization options are now available. Two-factor authentication Two-factor authentication is now available for end-users and staff users. You can let your users choose if they […]

by adrian at June 11, 2019 02:08 PM

OpenStack Superuser

Edge computing takeaways from the Project Teams Gathering

After the Open Infrastructure Summit, the Project Teams Gathering (PTG) offered time and space for developers to continue discussions they started at the Forum and prep for the upcoming release cycles by diving deep into technical specs. This post offers a recap and shows how you can get involved moving forward.

The Edge Computing Group met for a half-day session that included cross-project sessions with TripleO and Ironic. The group agreed on a few changes to its planned mission statement, then discussed testing the reference architecture models the group has identified to date. After multiple organizations offered hardware resources, the group decided to examine requirements to deploy reference implementations in different places.

At the TripleO session, people talked about their work on the distributed compute node architecture model and agreed the teams will try to maintain closer ties since TripleO’s testing work has a lot of overlap with the edge group’s reference architecture models. The two teams will collaborate on collecting and documenting experiences and feedback about the deployment options to make sure that the gaps are identified and addressed as well as writing up some best practices.

Bare metal came up during the joint TripleO session and the discussion ended up as one of the hot topics on the half-day agenda. There was a lot of interest around the L3 provisioning work; DHCP-related items and firmware updates were touched on as well. The session wrapped up with a short discussion about networking requirements and potential feature enhancements for Neutron around segmentation enhancements and L2 connectivity between locations.

StarlingX had one-and-a-half days to discuss a range of topics from processes to release planning. Similarly to the edge group, discussions began with formalizing a project mission statement. Then it was on to more technical topics like project deliverables and deciding what the community will put effort into testing and where concepts like third-party continuous integration are applicable. During the PTG session, participants leaned towards building one reference implementation and relying on the community for variations on options for things like the underlying operating system choices while making the platform flexible.

The next topic was testing, which brought session participants to brainstorm at the whiteboard. The community still has work to do in order to build up frameworks for higher-level testing like running system tests to make sure the platform is robust, flexible and reliable. Attendees set up plans and a proposal to prioritize the work with the goal of devising a community roadmap.

That left the rest of the available time to discuss the release process and, more importantly, release content that the community can put on their radar for the upcoming two or three releases. Participants decided that StarlingX will follow the OpenStack cycle because it integrates several OpenStack components and follows some of the current OS community processes.

Discussions about features covered various topics, ranging from containerization and storage to monitoring and provisioning. The session Etherpads contain notes on the topics covered, including the estimated timelines for feature releases. Some of the topics around containers were continuations of the Forum sessions, while there were further items as well, like containerizing Ceph and OVS-DPDK as part of the StarlingX platform.

Last but not least, time was spent discussing outreach and onboarding new contributors. One of the follow-up tasks is to set up a First Contact Special Interest Group to form a group of people to help newcomers.

Get involved

It’s impossible to summarize six days of hallway discussions or session-long back-and-forths in a blog post! If you’d like to learn more, catch the session videos, browse the PTG Etherpads and reach out to the communities on mailing lists and IRC to follow up on any questions you may have.

Looking ahead, the next Open Infra Summit will take place in Shanghai November 4-6. The call for papers and Travel Support Program are currently open.

 

The post Edge computing takeaways from the Project Teams Gathering appeared first on Superuser.

by Ildiko Vancsa at June 11, 2019 02:05 PM

Aptira

Open Network Integration. Part 4 – Making Open Networking Projects Agile

Aptira Open Networking: Agile

In the previous post we looked at the difference between “Being Agile” and “Doing Agile”. In this latest post on Agile Methods, we look at what this means for using Agile methods in Open Networking projects.

Characteristics of Open Networking Projects

Open Networking projects are challenging for many reasons and this requires that our project approach be defined specifically to address these challenges.

Open Network solutions typically contain a mix of technologies, people and organisational units that may have different perspectives on their implementation approach, especially if there are multiple vendors involved. Multiple technology streams are woven together across both software and hardware, each with its own supply chain, timelines and data models. A high level of integration means a large number of interfaces and interdependencies.

There are also likely to be many actual and potential users of the solution, especially for “generic” solutions like a cloud platform.

And typically, if a solution is more of a generic platform, this makes requirements gathering more difficult for many reasons.

These factors drive a higher level of uncertainty, both in terms of technology and end-user requirements.

Open Networks are Complicated and Complex Systems

A common analysis model used in decision-making is the Stacey Matrix, which has been adapted from its original form to computer systems in the following illustration:

Aptira Open Networking: Agile & Complex Systems Diagram

The characteristics of Open Networks place these projects towards the top-right quadrant of that matrix, into the “Complicated” and “Complex” solution contexts, and more likely into the “Complex” zone.

One important aspect of understanding the complexity of Open Networking projects is that they:

defy any single person’s ability to see the system as a whole and understand how all the pieces fit together. Complex systems typically have a high degree of interconnectedness of tightly-coupled components, and system-level behavior cannot be explained merely in terms of the behavior of the system components.

The DevOps Handbook

There are many implications of the attributes of complexity, not the least of which is the need for a high degree of co-operation and collaboration across the many people in the project team. Complex systems, with high degrees of uncertainty in either or both of requirements and technical baseline, have high rates of change across the entire lifecycle.

Such systems are difficult to reduce to static representations such as formal specifications or process descriptions.

The project team must constantly interact to share, reinforce and re-establish their understanding of the evolving system at any given point in time.

Agile and Open Networking

Agile (adaptive) implementation models work better in Complicated and Complex solution contexts, due to their ability to deal with change. The change is not just in the solution context, but in the changing understanding about the solution as it evolves over time amongst different stakeholders.

Participants in Open Network projects must constantly revise and re-establish their understanding of the solution overall and the relationship of their respective solution components and other components.

We closed the last post with the statement: “the mindset is more important than the methodology”.

The agile mindset is paramount in Open Networking projects, because we cannot, as a single team, fully understand or define the whole system in a static way at any one time. We need to be adaptable to the current and changing circumstances under which Open Network solutions are conceived, designed, implemented and operated.

Since we cannot fully and sustainably define the solution, rigid, rule-based and predictive approaches are very problematic. We need a different approach, and what the agile mindset brings is the ability to work with heuristic principles.

Heuristics are specific kinds of tools that help individuals and teams solve problems quickly under constraints and uncertainties.

They are “rules of thumb” that inform our problem-solving processes. They are not perfect by any means; they don’t have the same level of reliability as more formal methods of problem solving. But these formal methods usually require both time and good information: resources that are often not available in the Open Networking domain.

Heuristics are:

any approach to problem solving or self-discovery that employs a practical method, not guaranteed to be optimal, perfect, logical, or rational, but instead sufficient for reaching an immediate goal.

Heuristic – https://en.wikipedia.org/wiki/Heuristic

A decision-making strategy often used with heuristics is “satisficing”, a combination of the words “satisfy” and “suffice”. Satisficing is a strategy that seeks solutions or accepts choices that are “good enough” for their purposes but are not optimal.

We need heuristics when implementing Open Networking solutions because the rate of change is likely to be faster than the time to execute more formal processes. When we design a project approach for Open Networking projects, we must be sensitive to both the overall requirements and the individual and team situation. A key point is that the typical agile project delivers a software-only solution, with little regard for the underlying infrastructure that hosts the apps and the relevant interfaces, beyond sizing.

In the twenty-teens, many agile projects are focused on cloud-based SaaS solutions, which can be easily and incrementally scaled. Network components and systems have not traditionally been part of the agile mainstream and present a different set of requirements throughout the whole lifecycle.

There is no point in saying “Everyone must use Scrum” or “we’ll all work on 2-week sprints” if that causes disruption in one or more areas of the overall project.

Conclusion

There are no books on “Agile in Open Networking”, so whatever literature exists will inform Open Networking projects with generic information that must be tailored. A higher level of agile process knowledge is therefore needed in (at least) the early stages of the project, to ensure that tailoring the process is done in a way that promotes success but still retains the benefits of agile.

This is where Aptira can help, based on our deep and long experience in Agile approaches to complex solution integration projects like Open Networking.

We will touch on this more in future posts – Stay Tuned.

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Open Network Integration. Part 4 – Making Open Networking Projects Agile appeared first on Aptira.

by Adam Russell at June 11, 2019 01:26 PM

June 10, 2019

OpenStack Superuser

Travel Support Program launches for the Open Infrastructure Summit Shanghai

Open infrastructure runs on the power of key contributors.

If you have helped out and want to attend the upcoming Project Team Gathering (PTG) or the  Open Infrastructure Summit but need funds for travel, lodging or a conference pass, the Travel Support Program is here for you. Applications are now open for the Summit in Shanghai which takes place from November 4-6.

For every Summit, the OpenStack Foundation funds attendance for about 30 dedicated contributors from the open infrastructure community. Contributors to projects including Kubernetes, Kata Containers, Airship, StarlingX, Ceph, Cloud Foundry, OVS, OpenContrail, Open Switch and OPNFV are invited to apply by August 8, 2019.

You don’t have to be a code jockey, either. In addition to developers and reviewers, the Support program welcomes documentation writers, organizers of user groups around the world, translators and forum moderators. (The Support program doesn’t include university students, however, who are encouraged to apply for a discounted registration pass.)

Although applying is a quick process, remember to frame your request clearly. Spend some time answering the question about why you want to attend the Summit. If you make it all about how you’d like to network or visit the town where the summit is taking place, your request is likely to get voted down.

Applications are voted on by the Travel Committee, which is made up of six people total (one board member, one Technical Committee member, one person involved with Outreachy, one ambassador, one UC member and an OSF staffer). The composition of the committee is refreshed for each Summit.

“The biggest common mistake people make is not conveying their value to the community,” says Allison Price, marketing coordinator at the OpenStack Foundation who has participated in vetting the applications. “Focus on what you can contribute to the discussions or pinpoint sessions that would be useful to your business and you have a much better chance.”

Asking your company to pay for part of the expenses or finding a buddy to room with won’t influence your chances of getting in. However, she does recommend at least asking if your company will cover some of the costs — because often your employer is happy to chip in and it allows the Foundation to help more people.

Approved grantees to the Summit will be notified by August 22.

Photo // CC BY NC

The post Travel Support Program launches for the Open Infrastructure Summit Shanghai appeared first on Superuser.

by Superuser at June 10, 2019 02:05 PM

Christopher Smart

Securing Linux with Ansible

The Ansible Hardening role from the OpenStack project is a great way to secure Linux boxes in a reliable, repeatable and customisable manner.

It was created by a former colleague of mine, Major Hayden, and while it was spun out of OpenStack, it can be applied generally to a number of the major Linux distros (including Fedora, RHEL, CentOS, Debian, SUSE).

The role is based on the Security Technical Implementation Guide (STIG) out of the United States for RHEL, which provides recommendations on how best to secure a host and the services it runs (category one for highly sensitive systems, two for medium and three for low). This is similar to the Information Security Manual (ISM) we have in Australia, although the STIG is more explicit.

Rules and customisation

There is deviation from the STIG recommendations and it is probably a good idea to read the documentation about what is offered and how it’s implemented. To avoid unwanted breakages, many of the controls are opt-in with variables to enable and disable particular features (see defaults/main.yml).

You probably do not want to blindly enable everything without understanding the consequences. For example, Kerberos support in SSH will be disabled by default (via “security_sshd_disable_kerberos_auth: yes” variable) as per V-72261, so this might break access if you rely on it.

Other features require explicit values to be enabled. For example, V-71925 of the STIG recommends passwords for new users be restricted to a minimum lifetime of 24 hours. This is not enabled by default in the Hardening role (central systems like LDAP are recommended instead), but can be enabled by setting the following variable for any hosts you want it set on.

security_password_min_lifetime_days: 1
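For instance (a sketch only; the group name “hardened” is an assumption), the variable could be scoped to a set of hosts via group_vars rather than being hard-coded in the play:

mkdir -p group_vars
cat > group_vars/hardened.yml << EOF
# Enforce V-71925: minimum password lifetime of 24 hours for new users
security_password_min_lifetime_days: 1
EOF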

In addition, not all controls are available for all distributions.

For example, V-71995 of the STIG requires umask to be set to 077, however the role does not currently implement this for RHEL based distros.

Run a playbook

To use this role you need to get the code itself, using either Ansible Galaxy or Git directly. Ansible will look in the ~/.ansible/roles/ location by default and find the role, so that makes a convenient spot to clone the repo to.

mkdir -p ~/.ansible/roles
git clone https://github.com/openstack/ansible-hardening \
~/.ansible/roles/ansible-hardening
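If you prefer Ansible Galaxy over cloning directly, something along the following lines should also work (a sketch, assuming a reasonably recent ansible-galaxy that accepts git+https URLs; the ,master suffix pins the branch and -p sets the install path):

ansible-galaxy install -p ~/.ansible/roles \
git+https://github.com/openstack/ansible-hardening,master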

Next, create an Ansible play which will make use of the role. This is where we will set variables to enable or disable specific controls for hosts which are run using the play. For example, if you’re using a graphical desktop, then you will want to make sure X.Org is not removed (see below). Include any other variables you want to set from the defaults/main.yml file.

cat > play.yml << EOF
---
- name: Harden all systems
  hosts: all
  become: yes
  vars:
    security_rhel7_remove_xorg: no
    security_ntp_servers:
      - ntp.internode.on.net
  roles:
    - ansible-hardening
EOF

Now we can run our play! Ansible uses an inventory of hosts, but we’ll just run this against localhost directly (with the options -i localhost, -c local). It’s probably a good idea to run it with the --check option first, which will not actually make any changes.

If you’re running in Fedora, make sure you also set Python3 as the interpreter.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
./play.yml

This will run through the role, executing all of the default tasks while including or excluding others based on the variables in your play.

Running specific sets of controls

If you only want to run a limited set of controls, you can do so by running the play with the relevant --tags option. You can also exclude specific tasks with the --skip-tags option. Note that there are a number of required tasks with the always tag which will be run regardless.

To see all the available tags, run your playbook with the --list-tags option.

ansible-playbook --list-tags ./play.yml

For example, if you want to only run the dozen or so Category III controls you can do so with the low tag (don’t forget that some tasks may still need enabling if you want to run them and that the always tagged tasks will still be run). Combine tags by comma separating them, so to also run a specific control like V-72057, or controls related to SSH, just add them alongside low.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
--tags low,sshd,V-72057 \
./play.yml

Or if you prefer, you can just run everything except a specific set. For example, to exclude Category I controls, skip the high tag. You can also add both options.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
--tags sshd,V-72057 \
--skip-tags high \
./play.yml

Once you’re happy, don’t forget to remove the --check option to apply the changes.

by Chris at June 10, 2019 11:25 AM

June 07, 2019

OpenStack Superuser

Edge and 5G: Not just the future, but the present

Edge and 5G are not just something we talk about anymore, as the work done at the recent Open Infrastructure Summit, Forum and PTG shows.

Once considered applicable only in the far-off future, both use cases took center stage at the Open Infrastructure Summit Denver, in the keynote and breakout presentations. They were also the focus of collaborative sessions at the Forum and Project Teams Gathering, where project teams, working groups and SIGs meet to plan and develop upcoming releases.

You can catch all the conference presentations and panel discussions online, filtering for tracks like Edge Computing to find the ones that interest you most. For more, check out this blog post from the StarlingX community highlighting their Summit session recordings.

Deep dive into the sessions and what’s next

There was a half-day StarlingX workshop where attendees learned how to deploy the platform and take advantage of the features on offer. At the hands-on session, participants moved at their own pace with StarlingX community members on tap as mentors if they hit a snag or had questions. More details on the workshop are available on the StarlingX blog.

In parallel to the Summit sessions, community members also met at the Forum. This part of the conference features working sessions to encourage further discussions between users, operators and developers to strengthen the feedback loop between folks who use the software and those who implement it.

The OSF Edge Computing Group and StarlingX led a number of sessions at the Forum:

  • The Edge Group organized three sessions to discuss ongoing activities and plan next steps. The first covered the ongoing work to define reference architectures for edge scenarios. During the session, we briefly reviewed the current centralized and distributed models that the group has been working on. Attendees at the session agreed that the current models are relevant, but not the only suitable models for edge deployments. The current models follow a minimalistic approach that we walked through. Action items from the session include looking into how to set up the existing models in test environments while extending the models in a second phase. The additional components included employing Ironic for bare metal management and Horizon for the dashboard.
  • A session on edge use cases focused on reviewing the currently available list and identifying new ones. The discussion covered how the reference architectures relate to the already identified use cases. The discussion included gaming and broadcasting, as participants discussed technologies such as network slicing and the challenges of end-to-end mapping. Talk also touched on items like remotely managing edge sites in an automated fashion. The conclusion of the session was that all of the identified use cases are relevant and several on the list are in focus all over the industry. Attendees also re-emphasized the importance of reference architectures and testing activities.
  • The feedback and roadmap session covered discussions about the Edge Computing Group’s mission statement and summarized the discussions of the previous two sessions to prioritize next steps which are mainly around testing and the reference architecture work. During the session folks also talked about writing up a white paper to give an overview of the use cases the group is working on and the corresponding reference architecture work.
  • Similarly to the Edge group, StarlingX community members gathered at the Forum to discuss relevant topics and connect with users and operators for feedback and roadmap planning.
  • As StarlingX moves towards containerized control plane services and, in turn, towards offering environments where users can run mixed container- or virtual machine-based workloads, a session brought into focus how to leverage Keystone in that environment. (StarlingX uses Keystone as the identity management service, which can also help with the challenges of multi-tenancy in Kubernetes-based deployments.)
  • A follow-up Forum session on the above topic addressed how to run the aforementioned mixed workloads on a single node. One of the challenges is to address the different ways OpenStack and Kubernetes solve resource management tasks. One more Forum session was held on the path of containerization, targeting collaboration between OpenStack Helm, Airship and StarlingX to make sure the deployment of components is smooth and flexible.
  • To further cross-project collaboration, a session at the Forum focused on packaging services. StarlingX and OpenStack contributors discussed best practices that the OpenStack project has gathered over the past couple of years to ensure that services are available for deployment on different Linux distributions. During the session, participants covered ways to store and structure manifest files to make sure to have a flexible and robust process that doesn’t require too much overhead from the community.

Get involved

StarlingX

Check out the code on Git repositories: https://git.openstack.org/cgit/?q=stx
Keep up with what’s happening with the mailing lists: lists.starlingx.io
There are also weekly calls you can join: wiki.openstack.org/wiki/StarlingX#Meetings
Or for questions hop on Freenode IRC: #starlingx
You can also read up on project documentation: https://wiki.openstack.org/wiki/StarlingX

Edge

Check out the dedicated Edge page featuring a white paper, info on the weekly WG meetings and IRC.

Photo // CC BY NC

The post Edge and 5G: Not just the future, but the present appeared first on Superuser.

by Ildiko Vancsa at June 07, 2019 02:03 PM

Aptira

Open Network Integration. Part 3 – Being Agile

Aptira Open Networking: Become Agile

In the last post we introduced the Agile approach and gave a brief history with a current status, which is pretty messy. How do we navigate this messy scene? The answer (or at least a big part of it) lies in understanding the difference between “Being Agile” and “Doing Agile”. And that’s what we examine in this post.

Agile is simple, except for all the tricky bits

A key attribute of the Agile Manifesto is that its values and principles are all expressed as heuristics: simply-phrased guidelines that help solve problems rapidly, in the right circumstances.

At one level, this approach is inspired: heuristics inform the team members and the team works out the practical detail, bound together by the principles of social cohesion and known best practices. This is how an agile team works.

In practice, however, this creates numerous issues:

  • Many people find these heuristics confusing, high-level and lacking in detail. The problem starts when “more detail” becomes “more prescriptive detail” instead of “finer-grained understanding”.
  • An agile novice can confuse an expert’s guidance to “do this step, practice or technique”, with “do this and only this, always”.
  • Rules that tell people to be self-reflective and to promote process improvement are highly ineffective.
  • Many people are just more comfortable with prescriptive instructions because it sets clear accountability boundaries. “I did what you said. Don’t blame me if it didn’t work”.

Ultimately, agile teams can too easily develop a “this is how we do things around here” set of processes. Such teams extract only the codified parts of agile literature and training, and align around practices that they perform religiously, or worse, water down these tools and/or introduce processes from traditional project structures.

In this way, they achieve the status of “AINO”, which is short for “Agile in Name Only”, and find that they are “doing agile” processes, but it’s not improving the result.

Why is this so?

Understanding Agile - without all the detail

Firstly, the best agile methodologies are simple by design. Consider Jeff Sutherland’s description of Scrum:

“… not a development method or a formal process, rather it is a compression algorithm for worldwide best practices observed in over 50 years of software development. The Scrum framework is simple to implement and automatically unpacks and encourages a software development team to deploy best practices …”

Jeff Sutherland – The Scrum Papers: Nuts, Bolts, and Origins of an Agile Framework

There’s a lot that goes into the apparently very simple Scrum framework. All these best practice patterns and practices are compressed into this beautifully simple “shape”.

Aptira Open Networking: Agile Flow Diagram

The “shape” of Scrum is easy to understand and should be easy to do. But unpacking the patterns and practices of Scrum generates a lot of detail very quickly and expands into many domains beyond the areas of expertise or interest of the intended users, e.g. psychology, sociology and anthropology, to name a few.

Thinking of an agile methodology in this way is very useful, because we realise that these frameworks tap into something very powerful and generic: the natural ways in which people work together to solve problems.

It is for this reason that teams adopting an agile approach must understand both the explicit process codification of the methodology and the underlying principles.

Agile projects are driven by the interactions between team members and other stakeholders: this fundamental aspect of agility is the most documented but probably the least understood aspect of the whole agile approach.

Being Agile – a Mindset not a Prescription

To “be agile” is to start with a high-level “mindset” and for the team to work through operational details as they work towards their goals. This relationship between the Manifesto and the “real world” context is shown below.

Aptira Open Networking: Agile Mindset Diagram

Each artefact of any given methodology is a tool to aid in that process; but it is only one instance of any number of tools that could also be applied to the same function.

Agile has been designed to be adaptive: to change and be changed by each team in order to respond to each team’s unique situation. Agile adaptiveness works at two levels:

  • Firstly, adapting the in-progress development in response to feedback from the user stakeholders about a delivered increment of capability; and
  • Secondly, adapting team processes to enhance performance by reflecting on past activities and identifying areas of improvement.

So the “agile mindset” for practitioners is the mindset that is willing to be guided by heuristics and to adapt at these two levels.

There’s an assumption that nobody gets it right first time but that by adaptive response the team will continually and sustainably improve both the developed outcomes and the team itself.

It’s very common to see questions put to well-known and reputable agile experts on very fine points of practice and to have the answer come back “here are some thoughts … but do what works for your team”.

In order to successfully guide an agile implementation, you need to have a solid understanding of all four of these aspects, not just the methodology, plus a clear understanding of the knowledge and practices that underpin them.

The mindset is more important than the methodology.

So, how does this all help us manage Open Networking projects better?

We’ll tie this all up in our next post.

Stay tuned.

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post Open Network Integration. Part 3 – Being Agile appeared first on Aptira.

by Adam Russell at June 07, 2019 01:47 PM

Chris Dent

Placement Update 19-22

Welcome to placement update 19-22.

Most Important

We are continuing to work through issues associated with the spec for nested magic. Unsurprisingly, there are edge cases where we need to be sure we're doing the right thing, both in terms of satisfying the use cases as well as making sure we don't violate the general model of how things are supposed to work.

What's Changed

  • We've had a few responses on the thread to determine the fate of can_split. The consensus at this point is to not worry about workloads that mix NUMA-aware guests with non-NUMA-aware on the same host.

  • Support forbidden traits (microversion 1.22) has been added to osc-placement.

  • Office hours will be 1500 UTC on Wednesdays.

  • os-traits 0.13.0 and 0.14.0 were released.

  • Code to optionally run a WSGI profiler in placement has merged.

  • The request group mapping in allocation candidates spec has merged, more on that in themes, below.

Specs/Features

  • https://review.opendev.org/654799 Support Consumer Types. This has some open questions that need to be addressed, but we're still go on the general idea.

  • https://review.opendev.org/662191 Spec for nested magic 1. The easier parts of nested magic: same_subtree, resource request groups, verbose suffixes (already merged as 1.33). Recently some new discussion here.

These and other features being considered can be found on the feature worklist.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 20 (1) stories in the placement group. 0 (0) are untagged. 3 (1) are bugs. 4 (0) are cleanups. 11 (0) are rfes. 2 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

Pending Changes:

Main Themes

Nested Magic

The overview of the features encapsulated by the term "nested magic" are in a story.

There is some in progress code, some of it WIPs to expose issues:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

We continue to do cleanup work to lay in reasonable foundations for the nested work above. As a nice bonus, we keep eking out additional performance gains too.

Ed Leafe's ongoing work with using a graph database probably needs some kind of report or update.

Other Placement

Miscellaneous changes can be found in the usual place.

There are several os-traits changes being discussed.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

Making good headway.

by Chris Dent at June 07, 2019 12:13 PM

June 06, 2019

OpenStack Superuser

Lab-as-a-service: Crunching data with OPNFV, OpenStack and ONAP

As those lab coats crunch vast numbers and loads of data, they’re often doing it with open source. And, like everything in tech, it needs a catchy name: lab-as-a-service.

“Lab-as-a-service is a community resource, it’s an opportunity for developers and open-source users to have access to resources that they might not have. A lot of the stuff we’re working on today as a community with OPNFV, OpenStack and ONAP — and you can’t really fire those up on your laptop,” says Lincoln Lavoie, senior engineer at the Interoperability Lab at the University of New Hampshire (UNH-IOL), in an interview with TelecomTV. “It provides compute resources and networking resources to those users.”

Users login, make a reservation to the resource they need for a set amount of time, then work from remote via VPN. The service is currently powered by 54 servers, both Intel based and ARM-based systems, with a “fairly large” networking setup, a combination of 10 gig or 40 gig links to the servers “so you have plenty of resources.”

In the latest 2.0 version, users can perform more design and provisioning, configuring what the network looks like between multiple servers. “So if I’ve got two or three nodes that I’m provisioning and I want specific layer 2 networks between them, I could do an actual OpenStack deployment.  You can fit all that into what you’re trying to design.”

Lavoie says LaaS is a valuable community resource to make it easier for end users to try out open-source projects. For instance, in the latest version, there’s a one-click virtual deployment button for OpenStack that can run small VNFs on one node, giving users new to OpenStack or OPNFV a chance to take a test run. Developers, on the other hand, get access to the bare-metal system, including the lights-out management, and can do a lot of low-level checking and low-level installs. “It’s a pretty awesome resource,” he adds.

For more on what features are coming next, catch his full interview here.

The post Lab-as-a-service: Crunching data with OPNFV, OpenStack and ONAP appeared first on Superuser.

by Nicole Martinelli at June 06, 2019 02:01 PM

Aptira

Open Network Integration. Part 2 – Introduction to Agile

Aptira Open Networking: Agile

In the last post we gave an overview of the Open Network Integration domain. In this post we start by doing a deep-dive into Agile methods of solution development.

Agile is a large topic, so we’ll split this into three parts.

  • Introduction (this post)
  • The difference between “Being” and “Doing” (next post)
  • Making Open Networks Projects Agile (following post)

Introduction

Agile has become very popular and mainstream in the last 5-10 years or so; there would be few organisations with dedicated IT departments who have not at least experimented with it in the last few years.

Properly adopted, the use of Agile strategies can pay big dividends in terms of cost, speed and quality of Open Network Solutions, but only if it is used appropriately, and well.

There is a lot of “hype” that can generate misunderstanding and mis-application. Successfully implementing solutions with Agile is not as simple as just adopting a methodology, e.g. Scrum, and saying “we’re agile”. There’s a lot more to being agile than following a process.

The mixed paradigm world of Open Networking means that certain aspects of the implementation process need to be different from a typical agile project.

Origins of Agile

When the 17 folks at Snowbird Utah composed the Agile Manifesto in 2001, they weren’t just codifying a new approach to software development. They were also capturing emergent aspects of how people in the 20th Century were thinking and going about solving problems.

The Manifesto did not sprout in one weekend gathering. As well as the extensive prior experience and experimentation of its creators, the seeds of agile were sown well before the Manifesto was signed.

As an example, take “Scrum”, probably the most mainstream agile methodology. The concept of a “Scrum” was coined in the 1980s by Takeuchi and Nonaka, who were inventing new models of work organisation at Toyota and other places. “Scrum” as a software development methodology was formalised in the 1990s by Jeff Sutherland, Ken Schwaber and others, in parallel with other methodologies like XP (eXtreme Programming).

This emergent thinking was much more focused on the dynamics of how people worked together to solve problems, as opposed to the more rigid and static rules-based mechanisms used previously. By enabling a group of workers to control how work got done, this new approach was better able to respond to the accelerating business and community cycles of the time.

Not only did the group work cohesively on a complex task, but the group constantly reflected on their performance and identified how to adapt and improve over time. Toyota considered that the ability of their system to change and adapt as they learned and new circumstances arose was just as important as the system as it may have existed at any given point in time.

These two aspects are the deeply intertwined and inseparable foundations of agile practice.

Current State: Very Messy

Since 2001, the knowledge and use of agile has increased globally. Jeff Sutherland did very well to convince US businesses that “twice the work at half the price” was a very feasible objective with Scrum. 

Many businesses have adopted agile approaches, with varying levels of commitment and success. There are a number of mature, high-quality options when it comes to methodology frameworks, and many sources of high-quality support and information on practice.

However, all is not perfect in the agile world: we’ve gone from the first value in the Manifesto being “Individuals and interactions over processes and tools” to a market that endlessly promotes tools and processes. There are many implementations that have not paid off as well as Scrum’s compelling but simple slogan would claim.

Over time, the principles and values of the Agile Manifesto and its underlying theories have been eroded by the tendency to make principles more “rule-based” and prescriptive. In many projects, adherence to agile ceremony and process is more strictly policed than in traditional rule-based methodologies; in others, important aspects are ignored or treated as empty ritual. We will dive into this issue in the next post. 

Many practitioners, including a number of original Manifesto authors, lament that agile practices have become too over-adorned and that we have lost the simplicity and effectiveness of the original intent. In particular, Ron Jeffries refers to “Dark Scrum” as the result of impatient and poorly-supported Scrum implementations that don’t allow self-organisation to emerge. Martin Fowler coined the term “Agile Industrial Complex” to describe the collection of consultants, experts and managers who try to assert “best practice” rather than letting teams decide how best to do their work. 

These issues cannot all be blamed on the “Agile Industrial Complex”: practitioners in user organisations also have to own some responsibility. 

Conclusion

Agile is a very broad topic and is subject to much hype and misinformation. It is important to avoid the “Agile Industrial Complex” and other disenabling practices. Adopting an approach and reducing it to rules and processes will create more disruption than productivity. 

The important thing is to focus on being agile not just following an agile methodology. 

We’ll cover this in the next post. 

Stay Tuned. 

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Open Network Integration. Part 2 – Introduction to Agile appeared first on Aptira.

by Adam Russell at June 06, 2019 01:53 PM

June 05, 2019

StackHPC Team Blog

A Universe from Nothing: Try Kayobe in your own Model Universe

There is momentum building behind Kayobe, our deployment tool of choice for deploying OpenStack in performance-intensive and research-oriented use cases. At the recent Open Infrastructure Summit in Denver, Maciej Kucia and Maciej Siczek gave a great presentation in which they spoke positively about their experiences with Kayobe:

At the same summit, our hands-on workshop on Kayobe deployment had people queuing to get in, and received plenty of positive feedback from attendees who rolled up their sleeves and worked through the experience.

Universe from Nothing Workshop

One significant piece of feedback from the workshop was that people wanted to be able to try this workshop out at home, on their own resources, to enable them to share the experience and understand at their leisure how it all fits together.

So we added a page to the Kayobe docs for people looking to recreate A Universe from Nothing in their own time and space. As well as a step-by-step README, all the scripts for creating the lab environment are provided.

Universe from Nothing Tenks

To recreate the lab, a single server is required with a fairly relaxed baseline of requirements (a quick way to sanity-check them is sketched after the list):

  • At least 32GB of RAM
  • At least 40GB of disk
  • CentOS 7 installed
  • Passwordless sudo for the lab user
  • Processor virtualisation should be enabled (nested virt if it is a VM)
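That sanity check might look something like the sketch below; the thresholds and paths are illustrative and the nested virtualisation check assumes an Intel host:

#!/bin/bash
# Rough prerequisite check for the lab host (illustrative only)
free -g | awk '/^Mem:/ {print ($2 >= 32) ? "RAM OK" : "RAM: need at least 32GB"}'
df -BG --output=avail / | tail -1 | tr -d 'G ' | \
awk '{print ($1 >= 40) ? "Disk OK" : "Disk: need at least 40GB free"}'
grep -q "CentOS Linux release 7" /etc/redhat-release && echo "OS OK" || echo "OS: expected CentOS 7"
# Nested virtualisation on an Intel host should report Y (or 1)
cat /sys/module/kvm_intel/parameters/nested 2>/dev/null || echo "kvm_intel module not loaded"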

Have fun!

Universe from Nothing Logo

by Isaac Prior at June 05, 2019 10:00 PM

OpenStack Superuser

What’s next for OpenStack compute: Nova updates

At the recent Open Infrastructure Summit in Denver, project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, what’s new for this release and what to expect for the next one, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.

What

Nova, OpenStack’s compute service. The project aims to implement services and associated libraries to provide massively scalable, on demand, self-service access to compute resources, including bare metal, virtual machines and containers. One of the oldest OpenStack projects, Nova has 223 contributors for the Stein release and, according to the last User Survey, it’s deployed by 82 percent of OpenStack users.

Who

Current PTL Eric Fried of Intel and Melanie Witt, PTL for the Rocky and Stein releases, who works at Red Hat.

What’s new

This release cycle the team focused on delivering what would impact users the most, working around a series of themes, Witt says. These included:

  • Compute nodes capable of upgrading and co-existing with nested resource providers for multiple vGPU types
  • Multi-cell operational enhancements: resilience to “down” or poor-performing cells and cross-cell server resize
  • Volume-backed user experience and API hardening: ability to specify volume type during boot-from-volume, detach/attach of root volume, and volume-backed server rebuild

As a result, the team delivered a ton of new features — plus a series of microversions — for the Stein release, among them:

What’s next

They’re already full steam ahead with improvements for the Train release, here’s an overview of the work in progress:

Get involved!

Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the mailing list openstack-discuss at lists.openstack.org and use the tag [nova]
Check out the Nova wiki for more information on how to get involved – whether you’re just getting started or interested in going deeper. Participate in the weekly meetings: Thursdays alternating 14:00 UTC (#openstack-meeting) and 21:00 UTC (#openstack-meeting).

View the entire 35-minute session below.

The post What’s next for OpenStack compute: Nova updates appeared first on Superuser.

by Superuser at June 05, 2019 02:11 PM

Aptira

Open Network Integration. Part 1 – The Tip of the Spear

Aptira Open Networking: Agile

Following our Software Interlude posts, which we completed with this post on Development Paradigms, we now move on to unpacking the third and last domain: Open Network Integration.

  • Agile Methods
  • DevOps
  • Systems Integration

If you need a quick recap of the three domains of Open Networking, refer to the second article in this series. Also, the first post in this series describes the enablers of Open Networking.

The Integration Domain Overview

By integrating these technology enablers, Open Networking combines many practices, technologies, occupations and organisational units, which previously were quite distinct and separate, into one discipline.

The Integration domain includes not only technical integration but the process and operational integration required for these disparate practices to work cohesively so that we can define, build and operate an Open Networking solution.

Whilst this might sound easy, anyone working in the Open Networking space will have experienced the dissonance that can occur when these multiple worlds collide; we began to outline this in the Software Interlude.

The Open Network Integration domain provides a holistic view of the Open Networking solution that is being built and/or operated. Integration covers both the complete scope of functional and non-functional requirements of the solution as it is being built, and the end-to-end lifecycle view from conception through build to operations.

To the extent possible, the Integration domain fosters continuous and seamless transition through these lifecycle stages rather than the discontinuous build to operations approach that is common historically. Nonetheless we need to be able to deal with traditional approaches that may be still prevalent in many organisations.

To achieve coherent and performant solutions, we need a sharply focused perspective that leads us to success: in this, the Integration domain is literally “the tip of the spear”.

The Integration Domain Overview

To address the broad integration requirements of Open Network solutions, we include three complementary practices that extend end-to-end throughout the lifecycle of a solution.

  • Agile Methods
  • DevOps
  • Systems Integration

Here’s a quick preview of what we will cover in the remaining posts of this series:

Agile Methods and Processes

Agile methods and processes have been used for decades, long before the Agile Manifesto – they were just called something different. There are numerous documented benefits, but they are still widely misunderstood and misapplied, despite the pervasiveness and success of the many “brand name” methodologies (e.g. “Scrum”).

Regardless of the individual methodology, Agile builds solutions in small increments in close collaboration with end users. Agile projects move towards final functionality using an “inspect and adapt” approach. Agile fosters, and depends on, positive human dynamics (e.g. self-organisation), to facilitate communication, collaboration and to ensure a stable and coherent project team structure.

To succeed in complex “multi-everything” Open Networking projects, Agile provides more flexible and change-friendly approaches, with the primary focus of enabling end-user value creation rather than technical deliverables.

Agile practices are designed to meet these requirements and typically fill the gap between end-users or platform user stakeholders and the project team in a development project.

DevOps

The Integration domain includes DevOps as the means to highly performant production systems via automation of the entire lifecycle. DevOps is closely related to Agile but distinctly focused on the bridge between development teams and operational teams, where Agile focuses on the bridge between business teams and development teams.

DevOps practices such as CI/CD can support an Agile project and ease the delivery into production of components developed by the agile team.

Systems Integration

Systems integration is a set of practices that augments the above two elements of the Integration domain. This includes the generic practices common to all solution realisation projects, e.g. software development, project management, network and systems engineering, procurement and so forth.

Systems Integration is needed due to the unique requirements of Open Networking solutions.

Conclusion

Integrating Open Network solutions is one of the most complex forms of systems integration project. The benefits are great, but so are the challenges that must be addressed to achieve those benefits.

At its most simplistic, Open Networking is about the Integration of infrastructure and software into performant solutions to meet customer needs. At its most complex, Open Networking makes unique and significant demands of any integration practice.

The three practice enablers of the Open Networking Integration domain address these requirements.

As expert practitioners in the field of “Open Networking”, Aptira can help guide you through these considerations to a desired outcome.

We will expand on these topics and more in upcoming articles.

Stay tuned.

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Open Network Integration. Part 1 – The Tip of the Spear appeared first on Aptira.

by Adam Russell at June 05, 2019 01:47 PM

June 04, 2019

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

OpenStack Foundation news

  • Open Infrastructure Summit Denver
      • The videos from the Summit keynotes and breakout sessions are now available. Watch them now!
  •  Open Infrastructure Summit Shanghai

OpenStack Foundation Project News

OpenStack

  • OpenStack developers are chugging away on the Train development cycle. A discussion on +1 votes without accompanying comment led to a helpful reminder that we have documented reviewing guidelines inside our Project Team Guide. Please take a look if you have questions on how to review changes the OpenStack way!
  • As a way to celebrate little successes in our community, Alex Settle revived the Successes and Thanks bot. This week Ben Nemec reported solving the bandit issue that was blocking our CI, and Clark Boylan celebrated the deployment of Puppet-v4 on our infrastructure (thanks to Colleen Murphy!)
  • The documentation team is going through the final transition of decentralizing the documentation team thanks to Train PTL, Stephen Finucane. Patches are coming through thick and fast, but don’t be alarmed: OpenStack docs aren’t going anywhere and the team will still be around to support project teams. Stay tuned for more information in the coming weeks.
  • If you are running OpenStack, please log your deployment in the 2019 User Survey. The 2019 survey closes August 22 and your feedback is helpful in shaping future software releases.

Airship

  • Building upon the lessons learned from the use of the 1.0 release in production, the Airship team has started the design process for its next major release. A fundamental new component of this will be AirshipCTL, a new tool written in Go to pilot Airship deployments and upgrades. Get involved or follow along with the AirshipCTL specification.

StarlingX

  • If you’d like to catch up about StarlingX’s experience at the Open Infrastructure Summit in Denver check out the latest blog post on the website.
  • If you are evaluating StarlingX or participating in the project fill out this short survey to inform the community about your use case and give them feedback: https://www.surveymonkey.com/r/StarlingX.

Zuul

  • May 22 marked the seventh anniversary for the first public announcement of the Zuul project’s name, originally mentioned along with a summary of its initial design.
  • An article about Zuul’s confirmation as an official Open Infrastructure Project was published in Superuser.
  • James Blair posted another project update to the Zuul discussion mailing list, with details on new features under design as well as ongoing improvements in stability and performance slated for upcoming releases.

OSF @ open infrastructure community events

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org. To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at June 04, 2019 02:07 PM

Aptira

Technology Training: New Courses + EOFY discounts

Aptira Technology Training

When our Solutionauts aren’t busy building Cloud solutions, they are learning. Learning new skills, learning new technologies, learning how to build better solutions for our customers. And so should you.

With the end of financial year just around the corner, we’re offering EOFY discounts on all of our training courses. Our Solutionauts are quite flexible – and so are these deals! We’re happy to mix and match to get the best discount for you, including:

Discounts for Multiple Courses

Register for more than one course and save.

Discounts for Pre-Payment

Pre-pay for training to take place any time within the next 12 months. Bigger discounts apply if you bundle your pre-paid training with hardware, software or services.

Discounts for Software Bundles

Bundle your training courses with software licenses. For example:

  • SUSE licensing + Ceph training
  • Cloudify licensing + TOSCA training

Discounts for Hardware Bundles

Bundle your training with upgraded hardware. For example:

  • Noviflow switches + SDN/NFV training

Discounts for Service Bundles

Bundle your training with Consultancy or other Aptira services. We can build a new Cloud environment and train your staff to efficiently manage it. For example:

  • DevOps training + Consultancy services
  • Architecture Design training + Managed Cloud solution

Because our Solutionauts have been studying hard, we now have a LOT of new courses on offer, including:

Aptira - OpenStack Logo

OpenStack

OpenStack Private Cloud Administration: A 4 day intermediate course covering the fundamentals of the OpenStack open source IAAS (Infrastructure As A Service) cloud solution.

OpenStack Deployment & Advanced Administration: A 4 day intermediate course that builds on the basic OpenStack skills gained on the Private Cloud Administration course and will enhance users’ knowledge with more in-depth information.

OpenStack Developer Deep Dive: A 1 day intermediate course to help individuals get on their feet with contributing upstream, and to get familiar with the OpenStack development process.

Troubleshooting OpenStack: A FREE quick reference study guide to use for troubleshooting OpenStack components, including Keystone, Nova, Neutron, Glance, Cinder, Ceilometer, Swift and Heat. A perfect last-minute study guide for people sitting the Certified OpenStack Administrator exam.

Aptira DevOps Icon

DevOps

Introduction to DevOps Tools & Practices: A 2 day beginner course to introduce the concepts, tools & practices of DevOps, version control and automation.

SDN, NFV & DevOps Concepts: A 2 day intermediate course covering how Software Defined Networking, Network Function Virtualisation and DevOps concepts come together to simplify designing, building, testing and managing networks using software components.

Introduction to Monitoring: A 2 day intermediate course introducing participants to monitoring with Grafana and Prometheus. Participants will learn the basic concepts, architecture deployment and configuration.

Ansible Essentials: A 3 day intermediate course covering all the core components of Ansible, as well as dealing with sensitive data via Ansible Vault.

Docker Container Essentials: A 3 day intermediate course covering all the core features of Docker containers. Emphasis is placed on best practices and how to secure Docker installations and containers.

Kubernetes Container Essentials: A 2 day intermediate course covering how to install and setup Kubernetes, automated deployment, scaling & management of containerised applications.

Ceph Storage Essentials: A 2 day intermediate course covering the main concepts and architecture of Ceph, its installation and daily operation in OpenStack environments.

Puppet Essentials: A 3 day intermediate course covering the essential knowledge required to master puppet. From writing your first manifests to leveraging the full toolset of the language.

Python Programming: A 5 day intermediate course providing in-depth instructor-led python programming training.

Linux KVM Virtualisation: A 2 day intermediate course covering Linux KVM, including virtualisation basics, hardware components, KVM installation, admin tools, KVM guests & advanced topics.

Aptira Open Networking Icon

Open Networking

SDN Introduction: A half day intermediate course introducing Software Defined Networking. Participants will learn about the origins, basic concepts, architecture and building blocks, as well as examples of typical SDN usage types and implementations.

NFV Introduction: A half day intermediate course introducing Network Function Virtualisation. Participants will learn about the origins, basic concepts, architecture and building blocks, as well as examples of typical NFV usage types and implementations.

SDN & NFV Introduction: A full day intermediate course introducing Software Defined Networking and Network Function Virtualisation, their relationship in future networks and their co-existence with current generation equipment and systems.

SDN, NFV & DevOps Concepts: A 2 day intermediate course covering how Software Defined Networking, Network Function Virtualisation and DevOps concepts come together to simplify designing, building, testing and managing networks using software components.

SDN, NFV & Cloud Computing Fundamentals: A 2 day beginner course introducing Software Defined Networking, Network Function Virtualisation, Cloud Computing and the relation of these technologies together.

SDN & OpenFlow Workshop: A 2.5 day intermediate workshop allowing students to get their hands dirty with some networking, SDN & OpenFlow experiments.

SDN Controllers: A 2.5 day intermediate course introducing Software Defined Networking Controllers, including OpenKilda, OpenDayLight, ONOS & Faucet.

SDN & Cloud Computing Workshop: A 1 day intermediate Software Defined Networking & Cloud Computing workshop allowing students to gain functional knowledge about the components of Cloud Computing & SDN architecture.

SDN & Service Orchestrator Integration: A 3 day intermediate course allowing students to acquire knowledge and hands-on experience designing, building & operating computer networks in an automated fashion.

TOSCA Introduction: A 2 day intermediate course introducing Topology and Orchestration Specification for Cloud applications (TOSCA) and some of its basic concepts.

Aptira Cloud Icon

Cloud / Business

Business Evolution Workshop: A customised course enabling participants to produce an outcome for their business aligned to a special objective. We focus on solutions: Agreeing to achievable deliverables as part of your team, and working with your team across the workshop to produce them.

Agile Systems Integration: This customised workshop covers all the core features of Agile Systems Integration for Open Networking projects.

Architecture Design Workshop: This customised workshop covers all the core features of architecture design utilising cutting edge technologies and defining specific technical requirements to create a high-level architecture.

Cloud Orchestration Training: This customised course covers the core features of multi/hybrid cloud Orchestration using Cloudify.

SDN, NFV & Cloud Computing Fundamentals: A 2 day beginner course introducing Software Defined Networking, Network Function Virtualisation, Cloud Computing and the relation of these technologies together.

SDN & Cloud Computing Workshop: A 1 day intermediate Software Defined Networking & Cloud Computing workshop allowing students to gain functional knowledge about the components of Cloud Computing & SDN architecture.

TOSCA Introduction: A 2 day intermediate course introducing Topology and Orchestration Specification for Cloud applications (TOSCA) and some of its basic concepts.

Aptira Linux Training Logo

Linux

Linux KVM Virtualisation: A 2 day intermediate course covering Linux KVM, including virtualisation basics, hardware components, KVM installation, admin tools, KVM guests & advanced topics.

Red Hat Linux Fundamentals: A 4 day intermediate course focusing on the fundamental tools & concepts of Linux and Unix, allowing students to gain proficiency using the command line.

Red Hat Linux Network Services: A 5 day intermediate course covering a wide range of network services with special attention paid to the concepts needed to implement these services securely and the troubleshooting skills necessary for real world administration.

Red Hat Linux Security Administration: A 5 day advanced course with a highly technical focus on properly securing machines running the Linux operating system.

Red Hat Linux System Administration: A 5 day intermediate course that explores the installation, configuration and maintenance of Linux systems.

Red Hat Linux Troubleshooting: A 5 day advanced course designed to give Linux administrators experience with both common and uncommon system problems.

Aptira MySQL Training Icon

MySQL

MySQL Cluster: A 3 day intermediate course teaching students the important details of clustering that will help them get started with MySQL Cluster.

MySQL for Beginners: A 4 day beginner course covering all the MySQL basics, allowing students to get on their way with a solid foundation of MySQL.

MySQL for Database Administrators: A 5 day intermediate course to equip administrators to use all the features of MySQL to get the most out of their web, cloud and embedded applications.

MySQL for Developers: A 5 day intermediate course for developers planning on designing and implementing applications that make use of MySQL.

MySQL Performance Tuning: A 4 day intermediate course teaching practical, safe and highly efficient ways to optimise the MySQL server.

These deals can be purchased during the entire month of June but can be used at any time during the next financial year.  

If you’re looking to refresh your hardware, need new software licenses or would like to upskill yourself or your team – let us know. We can build a customised bundle to suit your requirements – and save you $$$. If there’s a specific bundle you require for your business and we haven’t listed it above, feel free to ask. Like we said earlier, our Solutionauts are flexible – and so are our deals. 

Learn from instructors with real world expertise.
Start training with Aptira today.

Start Learning

The post Technology Training: New Courses + EOFY discounts appeared first on Aptira.

by Jessica Field at June 04, 2019 01:59 PM

June 03, 2019

Aptira

Generic API Translation

Aptira Generic API Translation

One of the key aspects of designing modern applications is defining interfaces as Application Programming Interfaces (APIs) that allow businesses to securely expose content or services to their end users and customers.

Consider an enterprise application that needs to support different sets of clients from varying domains. Each of these clients has its own interface requirements for interacting with the enterprise application. For example, an online retail store that runs varied sets of applications to sell products might need to support clients such as native mobile applications and desktop/mobile browsers. Clients can also be API developers who build innovative applications using the capabilities that an application exposes.

Hence it is important to design an architecture that decouples the interface from the functionality, focusing on the execution of the actual business logic rather than the integration mechanism. With many designers structuring their applications as a set of loosely coupled, collaborating services, each exposing an API, an API management mechanism is required to mediate and integrate the different sets of clients with these services.

Another driving factor for a mediation layer is integrating applications with legacy systems that either expose proprietary interfaces or interfaces that are not planned to be modified because of the cost involved. For instance, a system that fetches data from a traditional Network Management System and passes it to API-based back-end systems needs to support an SNMP adaptation layer.

An API management layer acts as an API gateway that hides the underlying implementation details and exposes consistent interfaces to its consumers. The following are some of the factors to consider when designing applications around an API gateway.

NOTE: In all the examples discussed below, the assumption is that the applications deliver their services using REST APIs.

Interface Protocol

Depending on the type of application or system, consider the underlying protocol over which services are delivered to consumers: is it standard HTTP/HTTPS/SOAP/LDAP, or a proprietary protocol that the consumers support? The API gateway needs a protocol adaptation layer that converts the consumer-specific protocol to the back-end API mechanisms. For example, if a consumer supports only SOAP, then any request sent over this protocol should be parsed and translated to REST by the API gateway.

Data Mapping – Representation

Most current applications are designed to transmit JSON payloads, but legacy systems still use older data formats such as ASN.1 or XML. The API gateway should parse this data, taking into consideration serialization and deserialization as well as character representation (e.g. ASCII or UTF-8), and translate it into the JSON payload sent over the REST API. Since any data loss would impact back-end processing, care must be taken to map the parameters from the incoming data format to JSON while retaining the data representation.
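
As a rough illustration of this kind of representation mapping (not tied to any particular gateway product), the sketch below decodes an XML payload that may arrive as ASCII or UTF-8 bytes and re-emits it as a UTF-8 JSON document. The element names are hypothetical.

# Minimal sketch: convert a simple XML record into a JSON payload,
# preserving the character data when the source encoding is ASCII or UTF-8.
import json
import xml.etree.ElementTree as ET

def xml_to_json(raw_bytes, encoding="utf-8"):
    # Decode using the declared/negotiated encoding; ASCII is a subset of UTF-8.
    text = raw_bytes.decode(encoding)
    root = ET.fromstring(text)
    # Flatten child elements into a dict; a real gateway would be schema-driven.
    record = {child.tag: child.text for child in root}
    # ensure_ascii=False keeps non-ASCII characters instead of escaping them.
    return json.dumps(record, ensure_ascii=False).encode("utf-8")

legacy_xml = "<order><id>42</id><customer>Søren</customer></order>".encode("utf-8")
print(xml_to_json(legacy_xml).decode("utf-8"))
# {"id": "42", "customer": "Søren"}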

Data Mapping – Structure

After the representational and format differences are catered for by the Data Mapping – Representation function, the mapping requirements that must be handled next relate to the actual data model. Data elements are logically converted to the appropriate base type (e.g. integer, real, text, date, timestamp) so that they can be manipulated. The data mapping that occurs here can be simple re-ordering or restructuring (e.g. creating new record structures), or can involve more complex functions such as calculated items or unit conversions (e.g. imperial to metric). This data mapping can be defined as a simple declarative configuration, such as a field-to-field conversion map, or may require more complex processing involving procedural logic and decision processing.
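
A minimal sketch of such a declarative field-to-field map, combined with a simple imperial-to-metric conversion, might look like this (the field names and conversion are illustrative only):

# Minimal sketch: field-to-field mapping plus a unit conversion,
# driven by a declarative configuration rather than hard-coded logic.
FIELD_MAP = {
    "cust_name": "customerName",      # simple rename
    "order_dt": "orderDate",          # simple rename
    "weight_lb": ("weightKg", lambda lb: round(float(lb) * 0.45359237, 3)),
}

def map_record(source):
    target = {}
    for src_field, rule in FIELD_MAP.items():
        if src_field not in source:
            continue
        if isinstance(rule, tuple):            # (target field, conversion function)
            dst_field, convert = rule
            target[dst_field] = convert(source[src_field])
        else:                                  # plain rename
            target[rule] = source[src_field]
    return target

print(map_record({"cust_name": "Acme", "order_dt": "2019-06-03", "weight_lb": "12.5"}))
# {'customerName': 'Acme', 'orderDate': '2019-06-03', 'weightKg': 5.67}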

Data Mapping – Semantic

Mapping the data model will satisfy most use cases but there often is a larger objective: to ensure that the underlying meaning of the data is transferred between API caller and recipient in a usable way. Although the techniques for performing semantic mapping can be similar to logical data model mapping, the problems of semantic mapping are often more subtle and wide-reaching.

For example, an auto-scaling policy is applied to a system when a KPI reaches a specific threshold. Each system is designed differently and has its own way of calculating KPIs. Instead of projecting each system’s KPIs (such as CPU load or memory) directly, it is easier to normalize them across multiple systems by creating a mapping between the common KPIs and the system-specific KPIs. Such a mechanism to semantically map data across different systems can be designed using the API gateway.

Semantic mapping uses the same underlying data manipulation capabilities as the Logical Data model mapping, but typically requires a more intensive calculation model, possibly including persisting data values between API transactions, performing lookups and more complex algorithms.

Semantic mapping is also the API mapping function that could potentially use ML/AI capabilities.
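
Returning to the auto-scaling example, a semantic mapping layer might normalize each system’s vendor-specific metrics onto a common 0-100 “load percent” KPI before a shared scaling policy is evaluated. The system names, metric names and formulas below are purely illustrative:

# Minimal sketch: normalize vendor-specific KPIs onto a common "load_percent" KPI.
# The per-system converters below are placeholders, not real vendor metrics.
KPI_NORMALIZERS = {
    # System A reports a 1-minute load average plus its core count.
    "system_a": lambda m: min(100.0, 100.0 * m["load_1m"] / m["cores"]),
    # System B already reports CPU busy time as a percentage.
    "system_b": lambda m: float(m["cpu_busy_pct"]),
    # System C reports used/total memory pages; treat memory pressure as "load".
    "system_c": lambda m: 100.0 * m["pages_used"] / m["pages_total"],
}

def normalized_load(system, metrics):
    return KPI_NORMALIZERS[system](metrics)

def should_scale_out(system, metrics, threshold=80.0):
    # One common policy, evaluated against the normalized KPI.
    return normalized_load(system, metrics) >= threshold

print(should_scale_out("system_a", {"load_1m": 7.2, "cores": 8}))   # 90.0 -> True
print(should_scale_out("system_b", {"cpu_busy_pct": 35}))           # 35.0 -> False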

Security control

Application developers design their applications around a security framework that lets them provide secure services. One component of this framework is user management, which can span multiple domains. Hence it is important to control access to application APIs by preventing direct access to sensitive data. Such access control requires security policies for authentication to be defined, such as Single Sign-On, API keys, OAuth 2.0, data masking and other custom-defined policies. These policies are easier to manage when defined as part of a security framework layer in the API gateway.
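
As a toy illustration of enforcing such a policy at the gateway rather than in each back-end service, an API-key check might look like the following; the header names and key store are hypothetical:

# Minimal sketch: an API-key authentication check applied at the gateway,
# so back-end services never see unauthenticated traffic.
import hmac

# In a real gateway this would live in a secrets store, not in code.
API_KEYS = {"partner-a": "s3cr3t-key-a", "mobile-app": "s3cr3t-key-b"}

def authenticate(headers):
    client = headers.get("X-Client-Id")
    presented = headers.get("X-Api-Key", "")
    expected = API_KEYS.get(client)
    # compare_digest avoids leaking information through timing differences.
    if expected and hmac.compare_digest(presented, expected):
        return True
    return False

print(authenticate({"X-Client-Id": "partner-a", "X-Api-Key": "s3cr3t-key-a"}))  # True
print(authenticate({"X-Client-Id": "partner-a", "X-Api-Key": "wrong"}))         # False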

Traffic Control

Each of the application’s back-end systems is designed to support a specific API traffic pattern. Traffic patterns are outside the control of the application, since they depend on the behaviour of the client applications. Any change in the API traffic pattern would impact the planned resource usage of these systems and the performance of the whole application. Since API traffic can come from any of the client applications, it is important to control the API traffic for the whole application as well as for the back-end systems that deliver the services. The API gateway should provide a mechanism to set a threshold for API traffic and appropriate error handling to signal such conditions to the client applications.
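
One common way to implement such a threshold is a per-client token bucket that rejects requests (for example with HTTP 429) once the bucket is empty. The sketch below is a generic illustration, not any specific gateway’s API:

# Minimal sketch: a per-client token bucket used to throttle API traffic.
import time

class TokenBucket:
    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec        # tokens added per second
        self.capacity = burst           # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to the time elapsed since the last check.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}

def handle_request(client_id):
    bucket = buckets.setdefault(client_id, TokenBucket(rate_per_sec=5, burst=10))
    if not bucket.allow():
        return 429, "rate limit exceeded, retry later"
    return 200, "forwarded to backend"

print(handle_request("mobile-app"))   # (200, 'forwarded to backend')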

API Monetization

Some applications are designed to support end users who are either enterprise customers or developers using the service for research purposes. For instance, OpenWeatherMap is a website that publishes detailed weather conditions for a given set of location coordinates by exposing an API. It is important to distinguish these sets of end users since the API traffic requirements of each are different. To deliver services with the appropriate Quality of Service (QoS) for each end user, it is important to measure usage through a suitable billing mechanism. Hence an API gateway should include an API monetization platform that can bill end users based on their API usage.


In a nutshell, with most current applications designed around a microservices-based architecture and an API-first principle, it is quite evident that an API gateway acting as a single entry point plays an important role in helping enterprise customers deliver secure services to a wide spectrum of their user base.

Let us make your job easier.
Find out how Aptira's managed services can work for you.

Find Out Here

The post Generic API Translation appeared first on Aptira.

by Prashant Abkari at June 03, 2019 01:58 PM

OpenStack Superuser

Accelerating science with OpenStack

GENEVA—You never know who you might run into at the world’s largest particle physics laboratory. The hallways of The European Organization for Nuclear Research (CERN) campus are full of scientists heads down studying particle collisions. During my first visit, I walked past the couple who have been at CERN for over 60 years working the kinks out of the synchrocyclotron, spotted Belmiro Moreira showing off the rack of CERN’s first OpenStack cloud and saw engineers climbing through the Large Hadron Collider’s ALICE experiment.

Jonathan Bryce with Belmiro Moreira, Computing Engineer at CERN with the rack of servers that has been around since CERN first deployed OpenStack in 2011.

Each hallway collision produces an unmistakable wave of energy.

In the auditorium where the discovery of the Higgs Boson was announced, the first OpenStack Days CERN gathered 200 people and an additional 180 people from 30 countries via livestream. The two-day event offered one day of talks ranging from “Supporting DNA Sequencing at Scale with OpenStack” to “The Cookbook of Distributed Tracing for OpenStack” plus another day of unforgettable site visits to the ATLAS and ALICE experiments, as well as their onsite datacenter. (More on these to come!)

The crowd was a mix of researchers seeking answers to questions like: What is the composition of the universe, and how do galaxies form? How is the world’s climate evolving? What is the DNA sequence of common cancers? The rest of the auditorium was filled with software engineers building and operating the infrastructure to power this research.

Finding the answers to these questions requires years (or decades) of research producing a lot of data.

Here are a few of the organizations who presented not only their research use cases but also how OpenStack powers various workloads, including high performance computing (HPC):

  • The Large Hadron Collider (LHC) at CERN is a new frontier in energy and data volume with its experiments generating up to 88 petabytes per year in Run 2. A community of over 12,000 physicists stationed all over the world are using the LHC to answer the big questions about the evolution of the universe and rely on its computing infrastructure for this data analysis. This includes an OpenStack deployment of almost 300,000 cores across three data centers that include projects like Ironic for managing hardware and Magnum for managing their growing Kubernetes environment.
  • The Square Kilometre Array (SKA) is a collaborative effort among organizations from 13 countries to design the world’s largest radio telescope, which can measure phenomena like how galaxies merge to create stars and changes in the space-time continuum by tracking stars. A project spanning a total of 50 years, the SKA has a significant data challenge as it’s ingesting over 700 gigabytes of data per second. Stig Telfer, CTO of StackHPC, discussed how OpenStack could be used to handle this HPC use case, emphasizing the importance of collaborative efforts like the Scientific SIG to advance this research.
  • The NASA Center for Climate Simulation provides computing resources for NASA-sponsored scientists and engineers. Projects supported by their OpenStack environment include the Arctic Boreal Vulnerability Experiment (ABoVE), High Mountain Asia Terrain (HiMAT), and the Laser Communications Relay Demonstration (LCRD) Project. Benefits of the OpenStack private cloud deployment supporting this research include data locality, a better platform for lifting and shifting traditional science codes, and OpenStack APIs that provide a unifying vision for how to manage datacenter infrastructure so they can avoid creating unicorn environments.
  • With goals like transforming the research landscape for a wide range of diseases, sequencing the genomes of 25 unsequenced UK organisms, and sequencing the DNA of all life on Earth in 10 years, the Wellcome Sanger Institute supports projects with data intensive requirements. Their open infrastructure environment integrates OpenStack, Ceph, and Ansible to address their HPC use case.

“The many OpenStack deployments across multiple scientific disciplines demonstrates the results of shared design, development and collaboration,” says Tim Bell, compute and monitoring group leader, IT department at CERN, adding that the ties between open source and open science, both built on large international communities, were a recurring theme at the event.

In the afternoon, it was time to talk vGPU and FPGA support, OpenStack and Kubernetes integration, and distributed tracing, providing a different perspective as OpenStack contributors shared how the community is evolving the software to improve support for such data-intensive use cases.

The use cases and technical talks illustrated the need for cross community, open collaboration without boundaries, the theme of the welcoming keynote by Jonathan Bryce, executive director of the OpenStack Foundation. Despite the complex nature of the questions that this brain trust is trying to solve, the infrastructure challenges they face are similar to those of organizations in telecom, finance or retail.

“It’s pretty interesting how even though it’s a wild use case and with all of the crazy science they do, they face the same challenges like shared file systems, scaling and OpenStack upgrades,” said Mohammed Naser, VEXXHOST CEO who presented about OpenStack vGPU support.

“We have a lot of work to do to move [research] forward, but working together will make it easier,” Bell said.

Maybe you’ll even start to find answers while walking through the hallways of the world’s largest particle physics laboratory.

Cover photo: © 2019 CERN

The post Accelerating science with OpenStack appeared first on Superuser.

by Allison Price at June 03, 2019 01:04 PM

May 31, 2019

OpenStack Superuser

The OpenStack Travel Support Program: Why face-to-face connection matters

It’s a heck of a job. Members of the travel committee were tasked with choosing just nine Travel Support Program (TSP) recipients from the 85 valuable community members who applied. The reward was seeing them at the Open Infrastructure Summit in Denver and knowing they’re contributing to the road ahead.

The committee picked a diverse group: five nationalities, of which five are Active Technical Contributors, two are Active User Contributors, four are Active User Groups members and the group as a whole contributes to 11 projects.

“The Summit and PTG are both important events for contributors to OpenStack Foundation to collaborate and build relationships,” says recipient Michael Johnson. “The program helps key members of our community attend when circumstances may not have allowed them to [otherwise].” And while the OpenStack Foundation sets a budget to sponsor folks for every summit, demand always exceeds it. With support from donors, both individuals and organizations – thanks again everyone! – even more people can come together.

The Open Infrastructure community contributes from all points of the globe. Still, nothing can replace trading ideas, asking questions and swapping hacks in person.  The biggest takeaway for this group of TSP recipients? The program allowed them to meet, converse, collaborate and generate ideas in real life, in one place. For some, it means putting an IRC nickname to a face, a series of emojis with a real-life raised eyebrow. What starts with an excited handshake often leads to a heartfelt hug goodbye.

The program is part and parcel of what it means to be open. “[We] collaborate across communities, companies and cultures, bringing together people from all over the world with their experiences, their perspectives and their contributions to create something new, special and useful,” underlines Jonathan Bryce in his keynote.

If you or your organization is interested in learning more about sponsorship and how to contribute to the Travel Support Program, please contact us at summitATopenstack.org. Because after all, as Rico Lin, one of our recipients says, “It’s more than the value of the money. It’s an investment [in our contributors].”

Applications for Travel Support for Open Infrastructure Shanghai in November will open soon, stay tuned to Superuser for details.

Photo // CC BY NC

The post The OpenStack Travel Support Program: Why face-to-face connection matters appeared first on Superuser.

by Ashleigh Gregory at May 31, 2019 02:03 PM

Chris Dent

OpenStack Denver Summit Reflection

I've had "write a summary of ptg and summit" in my todo system for weeks. The summit was more than a month ago, at the end of April. I've been waiting for inspiration, some sense that I have something useful to say about the event. It hasn't come, I don't have anything noteworthy to report, but there has been change.

I started working in the OpenStack community five years ago. I'm on my third employer. The first two I quit, in large part, because of friction between two things: the community's subtextual insistence that if you wanted to be relevant (whatever that really means) you needed to be present at mid-cycles, PTGs, and summits and; employers being unwilling to support my attendance and continuous attention.

There's a great deal wrong with both sides of that situation. Both highlight some of the ways in which despite being an extremely successful (by some measures) open source project, OpenStack has also inspired a degree of toxicity and exclusivity that is anathema to some other measures of open source success.

Over the years, I think I've managed to become pretty relevant in the community, but up until very recently that has come at the (mostly self-imposed, because of a perverse sense of loyalty) cost of 60-80 hour weeks for most of those five years and never once being able to consider myself any of user, operator or deployer of OpenStack. That's not how open source ought to work. There is a boundary between me (a paid labourer creating a ton of value for corporations) and the true users, operators and deployers. In fact my obsessive occupation (and similar occupation by my peers) of any available contribution-space has made it harder for those people to participate. I've helped to solidify a profession or priesthood, but I'm thankfully a vanishing breed.

That profession, perhaps appropriate in 2015 or thereabouts, is now actively damaging to the matured OpenStack community. Because it pushes people out. We don't need professional developers, we need contributors. If we need professionals (and I think we do), those professionals should be maintainers: the people who facilitate and make contributors effective.

People have been saying things like that for years (I remember explicit statements about needing to switch to a maintainership model as early as the Barcelona summit, but I'm sure it was said much earlier than that) but it is a hard change to make: Systems of privilege work hard to maintain themselves and their members.

When I got myself elected to the TC two years ago, in addition to wanting to help the community (especially with regard to keeping the TC visible) it was also a gambit to ensure a) my employability, b) attendance at summits so that I could continue to be influential in decisions (such as extracting placement from nova, which, shamefully, would have been even more glacial had I not been present-in-person).

In Denver, several people said I seemed more relaxed and less angry than they were accustomed. This was by design. "I've limited my sphere of concern", I said. I quit the API-SIG. I quit the TC. I'm now committed to making the placement project something that people will be able to contribute to when they need or want to. I'm trying to become a maintainer. A stressed-out, angry dude who more often than not wants to yell at someone for creating arbitrary roadblocks to improvement or the flow of information (my native state since birth) does not a maintainer make.

To keep to that, I'm focused on placement. Not just in the specific project, but in any project that wants to use it (in or out of OpenStack).

And I set a timer each day. For eight hours. When that timer gets to 0, I'm done for the day. Because this is still a job — I'm still a labourer helping to create a ton of value for many corporations, not just the one that pays me — and despite the superficial trends, all this automation we're so busily creating should be enabling more leisure time, not less. I'd like my health back.

Here I am contemplating being a professional maintainer. What does that mean? It probably means writing less code for features and more code (and docs) to enable other people to write features.

But more relevant to the opening of this post: it probably also means limiting required attendance to events like PTGs and summits. If not all contributors are going, then some contributors going can create an exclusive club of those in the know, those who are relevant. Those of us who have a history of being relevant in that sense have a very bad record with regard to keeping others in the info loop and decision making process (see above about anger at arbitrary info roadblocks). We need to explicitly compensate.

So: I'm wondering if it is time to stop attending the PTG and the summit. Nothing would have been significantly different if I hadn't been at Denver, despite being the PTL; the "pre-PTG" that we did was a success. If I didn't go, we'd probably do that again. We'd be inclusive of anyone with access to email, and save a ton of carbon in the process.

Would it work?

by Chris Dent at May 31, 2019 02:00 PM

Placement Update 19-21

And here we have placement update 19-21.

Most Important

The spec for nested magic has been split. The second half contains the parts which should be relatively straightforward. The original retains the functionality that may be too complex. An email has been sent to operators asking for feedback.

Those two specs represent a significant portion of the work planned this cycle. Getting them reviewed and merged is a good thing to do.

What's Changed

  • A few small refactorings plus the removal of null provider protections have significantly improved performance when retrieving 10,000 allocation candidates: from around 36 seconds to 6 seconds.

  • We've chosen to switch to office hours. Ed has started an email thread to determine when they should be.

  • Tetsuro's changes to add a RequestGroupSearchContext have merged. These simplify state management throughout the processing of individual request groups, and help avoid redundant queries.

  • Most of the code for counting (nova) quota usage from placement has merged.

  • Microversion 1.33 of the placement API has merged. This allows more expressive suffixes on granular request groups (e.g., resources_COMPUTE in addition to resources1).
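
As a rough illustration (not taken from the official documentation), a granular allocation-candidates request using a string suffix could look something like the snippet below; the endpoint, token and resource amounts are placeholders:

# Rough sketch of a granular GET /allocation_candidates request with a
# string request-group suffix, as allowed from microversion 1.33 onwards.
# The endpoint and token below are placeholders.
import requests

params = {
    "resources_COMPUTE": "VCPU:2,MEMORY_MB:4096",
    "required_COMPUTE": "HW_CPU_X86_AVX2",
    "resources_NET": "NET_BW_EGR_KILOBIT_PER_SEC:1000",
    "group_policy": "none",
}
headers = {
    "OpenStack-API-Version": "placement 1.33",
    "X-Auth-Token": "<token>",
}
resp = requests.get("http://placement.example.com/allocation_candidates",
                    params=params, headers=headers)
print(resp.status_code)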

Specs/Features

These and other features being considered can be found on the feature worklist.

Some non-placement specs are listed in the Other section below. Note that nova will be having a spec-review-sprint this coming Tuesday. If you're participating in that, spending a bit of time on the placement specs would be great too.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 19 (-1) stories in the placement group. 0 (0) are untagged. 2 (0) are bugs. 4 (-1) are cleanups. 11 (0) are rfes. 2 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 12 microversions. No change since the last report. Note: Based on conversations that we see on reviews, explicitly trying to chase microversions when patching the plugin may not be aligned with the point of OSC. We're trying to make a humane interface to getting stuff done with placement. Different microversions allow different takes on stuff. It's the stuff that matters, not the microversion.

Pending Changes:

Main Themes

Nested Magic

The overview of the features encapsulated by the term "nested magic" is in a story.

There is some in progress code, some of it WIPs to expose issues:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

We continue to do cleanup work to lay in reasonable foundations for the nested work above. As a nice bonus, we keep eking out additional performance gains too.

Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment.

Other Placement

Miscellaneous changes can be found in the usual place.

There are several os-traits changes being discussed.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. As announced last week, anything that has had no activity in 4 weeks has been removed (many have been removed).

End

That's a lot of reviewing. Please help out where you can. Your reward will be brief but sincere moments of joy.

by Chris Dent at May 31, 2019 01:50 PM

StackHPC Team Blog

StackHPC at the CERN OpenStack Day 2019

With a subtitle of Accelerating Science with OpenStack, the CERN OpenStack day was always going to be our kind of conference. The schedule was packed with interesting content and the audience was packed with interesting people.

Stig had the privilege of co-presenting two projects that StackHPC have supported - with Chiara Ferrari for the SKA radio telescope, and with Jani Heikkinen for the BioMedIT project.

Stig co-presenting with Chiara Ferrari

In addition to the projects themselves, Stig promoted the OpenStack Scientific SIG as a forum for information sharing for scientific use cases.

Stig co-presenting with Jani Heikkinen

Thanks to Belmiro and the CERN team for all their effort to make the day such a great success!

by Stig Telfer at May 31, 2019 08:00 AM

May 30, 2019

Aptira

Virtual Network Function Orchestration (NFVO) using Cloudify

Aptira Virtual Network Function Orchestration (NFVO) using Cloudify

One of our customers was building a greenfield Network Functions Virtualisation Infrastructure (NFVi) platform on a private Cloud with a full-stack configuration. The NFVi platform needs to be distributed across multiple data centers located in multiple geographic regions, so we helped them build a Virtual Network Function Orchestration (NFVO) layer using Cloudify.


The Challenge

The full-stack configuration includes a Cloud platform / Virtualised Infrastructure Manager (VIM), Orchestration, Software Defined Networking (SDN), and solution-wide alarming and monitoring. The architecture of NFVi platform was based on the guidelines set by ETSI specifications, which define a Management and Network Orchestration (MANO) architecture as a combination of Network Functions Virtualisation Orchestration (NFVO) and Generic Virtual Network Function Manager (G-VNFM) capabilities. Our customer planned to build a NFVO/G-VNFM layer across the whole platform to deploy and manage the lifecycle operations of multiple Virtual Network Functions (VNFs).

Since Aptira are Cloudify’s service and product partners in the APAC region, the customer engaged with Aptira directly to build the NFVO layer which involves engaging with vendors of different components within the NFVi stack.


The Aptira Solution

We designed and developed an Orchestration layer using Cloudify since it aligns with ETSI MANO reference architecture. To orchestrate complex VNF workloads, the following design factors were taken into consideration when building the layer:

  • Service Modelling

The VNF workloads are designed using Cloudify’s Service designer tool that generates the TOSCA templates to define building blocks of a Cloud workload such as Compute, Network, Database and others. The templates generated by the designer tool are stored as blueprints in the Cloudify Service Catalog.

  • Integration with VIM

The customer was planning to orchestrate VNF workloads across multiple OpenStack-based VIM platforms such as Mirantis, Red Hat and Ericsson. Cloudify was integrated with these platforms using its in-built plugin mechanism. For VIM platforms that expose proprietary OpenStack APIs, such as Ericsson, a feasibility assessment of those APIs was completed so that plugins could be developed for integration.

  • Integration with 3rd party V-NFM

Most of the VNF vendors provide a VNF Manager (VNFM) to manage the lifecycle of the VNFs. Since Cloudify acts as a Generic-VNFM, it was integrated with the vendors’ VNFMs using interfaces such as REST API/Netconf.

  • Scalable deployment model

Since the VNF workloads to be deployed span across multiple different regions and data centers, a stable and reliable Cloudify deployment model was proposed to handle the scale of the VNF deployments.

A hierarchical model was proposed that can orchestrate workloads across regions without much dependency on customer’s environment parameters such as WAN latency.

  • Managing Lifecycle of VNFs

The customer also had VNFs that did not have a VNFM offered by the vendor. In such cases Cloudify was integrated with VNFs that support a different set of interfaces such as Netconf/YANG using its in-built plugin mechanism. This also handled the operational aspects of the VNF by performing Scale-out and Scale-in.

Considering the complexity of the NFVI platform being built, Aptira drafted a solution design to satisfy all the key architecture reference points that customer had envisioned in their overall solution architecture. Under the guidance of the customer’s internal architecture team and the expertise of Aptira and Cloudify, a detailed component level solution design was proposed and integrated into the customer’s detailed solution architecture document.

To demonstrate all the factors and adherence to the customer’s envisioned reference architecture, Aptira in collaboration with the vendor teams designed a use case using an enterprise VNF.

Since the customer was still in the initial stages of setting up their infrastructure in data centers, we demonstrated the use case by setting up the environment in their lab, mimicking their expected environment. This exercise was executed in an agile fashion, with each key design consideration demonstrated. The implementation was primarily driven by Aptira, with a team comprising our engineers and Cloudify experts, and executed under the guidance of the customer’s architecture team.

This engagement resulted in Aptira proposing a scalable design architecture and successfully demonstrating most of the orchestration-related requirements in the customer’s lab.


The Result

Since the orchestration layer acts as a primary integration point, a detailed solution architecture was a critical element in project governance, enabling the customer’s architecture and onboarding teams to discuss the Virtual Network Function Orchestration solution with prospective tenants looking to onboard and migrate their workloads.

The client is now equipped with a detailed solution architecture for their greenfield NFVi environment which can now easily be distributed and managed across multiple data centers in multiple geographic regions.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Virtual Network Function Orchestration (NFVO) using Cloudify appeared first on Aptira.

by Aptira at May 30, 2019 11:33 PM

Pablo Iranzo Gómez

Emerging Tech VLC‘19 - Citellus - Automating checks

Citellus:

Citellus - Check your systems!!

https://citellus.org

Emerging Tech Valencia 2019: May 30

Who am I?

Involved with Linux since shortly before starting university and then throughout it, taking part in the LinUV and <Valux.org> associations.

I started ‘making a living’ from free software in 2004 and joined Red Hat in 2006 as a Consultant, then as a Senior Technical Account Manager, later as a Principal Software Maintenance Engineer, and currently as a Senior Software Engineer on the Solutions Engineering team.

What is Citellus?

  • Citellus is a framework, accompanied by scripts created by the community, that automates the detection of problems, including configuration issues, conflicts with installed package versions, security problems or insecure configurations, and much more.

History: how did the project start?

  • An on-call weekend spent reviewing the same configurations over and over on different systems made the need for automation obvious.

  • A few simple scripts and a bash ‘wrapper’ later, the tool started to take shape; shortly afterwards the ‘wrapper’ was rewritten in Python to provide more advanced features.

  • In those early days we also had conversations with engineering and, as a result, a new, simpler test design was adopted.

What can I do with Citellus?

  • Run it against a system or against a sosreport.
  • Resolve problems sooner thanks to the information it provides.
  • Use the plugins to detect current or future problems (lifecycle, etc.).
  • Write new plugins in your preferred programming language (bash, python, ruby, etc.) to extend the functionality.
    • Contribute those new plugins to the project for the benefit of others.
  • Use that information as part of proactive actions on your systems.

Any real-life examples?

  • For example, with Citellus you can detect:
    • Incorrect deletions of keystone tokens
    • Missing parameters for expiring and purging ceilometer data, which can end up filling the hard disk.
    • NTP not synchronized
    • Outdated packages that are affected by critical or security bugs.
    • And more! (860+ plugins at the moment, many of them with more than one check per plugin)
  • Anything else you can imagine or program 😉

Changes driven by real-world examples?

  • Initially we only worked with RHEL (6, 7 and 8), as those are the supported releases.
  • Since we work with other internal teams such as RHOS-OPS, which use for example the RDO project, the upstream version of Red Hat OpenStack, we started adapting tests to work on both.
  • In addition, we started creating extra functions to operate on Debian systems, and a colleague has also been sending patches to fix some issues on Arch Linux.
  • With the appearance of Spectre and Meltdown we also started adding checks for certain packages and verifying that the options protecting against those attacks have not been disabled.

Some numbers about plugins:

  • healthcheck: 79
  • informative: 2
  • negative: 3 ('system: 1', 'system/iscsi: 1')
  • openshift: 5
  • openstack: 4 ('rabbitmq: 1')
  • ovirt-rhv: 1
  • pacemaker: 2
  • positive: 35 ('cluster/cman: 1', 'openstack: 16', 'openstack/ceilometer: 1', 'system: 1')
  • rhinternal: 697 (mostly bugzilla, openshift, openstack, pacemaker, security and system checks, the largest groups being 'sumsos/kbases: 426' and 'system: 56')
  • supportability: 3 ('openshift: 1')
  • sysinfo: 18 ('lifecycle: 6', 'openshift: 4', 'openstack: 2')
  • system: 12 ('iscsi: 1')
  • virtualization: 1
  • Total: 862

The Goal

  • Make it extremely simple to write new plugins.
  • Allow them to be written in your preferred programming language.
  • Keep it open so that anyone can contribute.

How to run it?

Highlights

  • Plugins in your preferred language
  • Output can be written to a JSON file for processing by other tools.
    • The generated JSON can be visualized via HTML
  • Support for ansible playbooks (live, and also against a sosreport if adapted)
    • Extensions (core, ansible) make it easy to extend the supported plugin types.
  • Save/restore the configuration
  • Install from pip/pipsi if you don't want to git clone the repository, or run it from a container.

HTML interface

  • Created when using --web; open the generated citellus.html file over http to view it.

Why upstream?

  • Citellus is an open source project. All plugins are submitted to the repository on GitHub to share them (that is what we want to encourage: reuse of knowledge).
  • Everyone is an expert in their own area: we want everyone to contribute
  • We use an approach similar to other open source projects: gerrit for code review and UnitTesting to validate basic functionality.

How to contribute?

There is currently a strong presence of OpenStack plugins, since that is the area we work in daily, but Citellus is not limited to one technology or product.

For example, it is easy to add checks for whether a system is correctly configured to receive updates, for specific versions with known bugs (Meltdown/Spectre) and whether the protections have been disabled, for excessive memory consumption by a process, for authentication failures, and so on.

Read the contributor guide at: https://github.com/citellusorg/citellus/blob/master/CONTRIBUTING.md for more details.

Citellus vs other tools

  • XSOS: Provides system information (RAM, network, etc.) but does not analyze it; in practice it is a 'pretty' viewer of information.

  • TripleO-validations: runs only on 'live' systems, which is impractical for audits or for providing support.

Why not sosreports?

  • There is no choice to make between one or the other: SOS collects system data, Citellus analyzes it.
  • Sosreport ships in the base channels of RHEL and Debian, which makes it widely distributed, but also makes it harder to receive frequent updates.
  • Much of the data needed for diagnosis is already in the sosreports; what is missing is the analysis.
  • Citellus is based on known issues and is easily extensible; it needs shorter development cycles and is more oriented towards devops or support teams.

What's under the hood?

A simple philosophy:

  • Citellus is the 'wrapper' that runs everything.
  • It lets you specify the folder containing the sosreport
  • It looks for the plugins available on the system
  • It runs the plugins against each sosreport and returns the status.
  • The Citellus framework, written in Python, handles options, filtering, parallel execution, etc.

And the plugins?

The plugins are even simpler:

  • Written in any language that can be executed from a shell.
  • Output messages go to 'stderr' (>&2)
  • If bash strings are written as $"string", the included i18n support can be used to translate them into the desired language.
  • Return $RC_OKAY if the test passes / $RC_FAILED on error / $RC_SKIPPED for skipped tests / anything else for unexpected failures.

And the plugins? (continued)

  • They inherit environment variables such as the root folder for the sosreport (empty in Live mode) (CITELLUS_ROOT) or whether it is running in live mode (CITELLUS_LIVE). No keyboard input is needed
  • For example, 'live' tests can query values in the database, while sosreport-based ones are limited to the existing logs.

Example script
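
The script from the original slide is not reproduced here. As a rough sketch only, and since plugins can be written in any language, here is what a minimal plugin might look like in Python; it assumes the environment variables described above (CITELLUS_ROOT, plus $RC_OKAY, $RC_FAILED and $RC_SKIPPED) are exported by the framework, and the numeric fallbacks are placeholders:

#!/usr/bin/env python3
# Illustrative sketch only -- not the script shown on the original slide.
# Checks whether a chrony or ntp configuration file is present, either on a
# live system or inside a sosreport.
import os
import sys

# The framework exports the return codes as environment variables; the
# numeric defaults below are just placeholders for standalone testing.
RC_OKAY = int(os.environ.get("RC_OKAY", 10))
RC_FAILED = int(os.environ.get("RC_FAILED", 20))
RC_SKIPPED = int(os.environ.get("RC_SKIPPED", 30))

# CITELLUS_ROOT is the sosreport folder, or empty when running live.
root = os.environ.get("CITELLUS_ROOT", "")

candidates = [root + "/etc/chrony.conf", root + "/etc/ntp.conf"]
present = [path for path in candidates if os.path.isfile(path)]

if not present:
    sys.stderr.write("no chrony.conf or ntp.conf found\n")
    sys.exit(RC_FAILED)

# A real plugin would go on to validate the configured time sources, etc.
sys.exit(RC_OKAY)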

Ready to dig deeper into the plugins?

  • Each plugin must validate whether or not it should run, print its output to 'stderr' and set the return code.
  • Citellus will run and report on the tests based on the filters used.

Requirements:

  • The return code must be $RC_OKAY (ok), $RC_FAILED (failed) or $RC_SKIPPED (skipped).
  • Messages printed to stderr are shown if the plugin fails or is skipped (when verbose mode is used)
  • If run against a 'sosreport', the CITELLUS_ROOT variable contains the path to the specified sosreport folder.
  • CITELLUS_LIVE contains 0 or 1 depending on whether or not it is a live run.

How to start a new plugin (for example)?

  • Create a script at ~/~/.../plugins/core/rhev/hosted-engine.sh
  • chmod +x hosted-engine.sh

How to start a new plugin (continued)?

How to start a new plugin (with functions)?

How to test a plugin?

  • Use tox to run some UT checks (utf8, bashate, python 2.7, python 3)

  • Tell Citellus which plugin to use:

What is Magui?

Introduction

  • Citellus works at the level of an individual sosreport, but some problems only show up across sets of systems (clusters, virtualization, farms, etc.)

For example, Galera needs to check the seqno across the various members to see which one holds the most up-to-date data.

What does M.a.g.u.i. do?

  • It runs citellus against each sosreport or system, collects the data and groups it by plugin.
  • It runs its own plugins against the collected data, highlighting problems that affect the whole set.
  • It can collect data from remote systems via ansible-playbook.

What does it look like?

Next steps with Magui?

  • It has a few plugins at the moment:
    • Aggregating citellus data sorted by plugin for quick comparison
    • Showing the 'metadata' separately to contrast values
    • pipeline-yaml, policy.json and others (OpenStack-related)
    • galera seqno
    • redhat-release across systems
    • Faraday: compares files that should be identical or different across systems

Next steps

  • More plugins!
  • Spread the word about the tool so that, together, we can make it easier to solve problems and to detect security issues, incorrect configurations, etc.
  • Momentum: many tools die because they have a single developer working on them in their spare time; getting contributions is essential for any project.
  • Write more tests in Magui to identify more cases where problems appear at the level of groups of systems rather than at the level of individual systems.

Other resources

Blog posts:

Questions?

Thanks for attending!!

Come to #citellus on Freenode, https://t.me/citellusUG on Telegram, or get in touch with us:

Presentation available at:

https://iranzo.github.io

by Pablo Iranzo Gómez at May 30, 2019 05:30 PM

OpenStack Superuser

Tips for making your next “as-a-service” project a success

This post is a detailed write-up of a lightning talk I gave at the recent Open Infrastructure Summit in Denver. Slides are available here and a video here.

(See also the related talk, “Don’t Repeat Our Mistakes: Lessons Learned from Running Go Daddy’s Private Cloud“)

Currently I work on the TechOps team at Twilio SendGrid, which manages all our physical infrastructure and virtualization and container orchestration platforms. Before that I was at Go Daddy for about seven years where I worked on the OpenStack private cloud, as well as a couple other cloud servers products.

Overall I’ve been in the software-as-a-service industry for over 16 years, so I have experience running several different platform services over that time. I’d like to share a few tips for being successful doing that.

Starting point

As your starting line, before you build anything, define who exactly your customers or users are. Think about what their pain points are and what their expectations will be. In fact, actually go talk to them to find out! Don’t make assumptions about this. (If you have a product manager or product owner, they can help.)

This is a step we missed with the OpenStack cloud at Go Daddy. We assumed that “going to the cloud” was what everyone wanted and that it would be great. But we didn’t take the time to really understand the perspective of the users. And as you’ll see below, this led to some unexpected behaviors.

Solid foundation

A good foundation to start from is to write down what your platform does (and doesn’t) do. (Formally this is called a service contract.) It defines the demarcations between you and your users, and helps you manage expectations.

If you don’t do this, there will be a lot of bad assumptions on both sides! And you’ll become the default support for anything even somewhat related to your platform. Any problem is now going to come to you.

In the Go Daddy OpenStack cloud, any time there was any problem inside a VM, that question would come to our team to troubleshoot. Even though we really had no control over things inside the VM, or any way to influence it, the questions came to us simply because they were running on our platform.

Early adopters

Try to find some early adopters to be beta users that will give you early feedback. They should be fairly knowledgeable and experienced, and familiar with new paradigms, like “architecting for the cloud,” etc. Really invest in building that relationship, these people will be your ambassadors to others.

We had really good success with this at both Go Daddy and SendGrid.

The early adopters of our private cloud at Go Daddy used our platform to build a public-facing cloud servers product. They were able to give us a lot of great feedback and they were super easy to work with, because they understood the infrastructure and how to best use it.

And at SendGrid, our TechOps department “dog foods” our Kubernetes platform and helps to iron out any issues with new features. It’s super helpful to have that early and fast feedback loop.

Iterative improvements

Make sure you’re always delivering new value by doing iterative improvements. (Agile and Scrum can help a lot with this, if you’re not already doing it.) It’s really about breaking work down into smaller chunks so you have a constant stream of little improvements.

Try to focus on the specific outcomes that you want (what you actually want to have happen) and don’t get too bogged down by the implementation details. Your goal should be to provide some sort of added value every sprint or iteration, and communicate that out through demos and announcements.

This was a challenge at Go Daddy when we were working through a project to integrate with our legacy asset management system. That project had a lot of unexpected problems and delays, and it ballooned out to a few months. The requirements were pretty vague, and the actual outcomes weren’t clear, so we were chasing a moving target, so to speak. The worst part was during this time we weren’t delivering any value to our users, which hurt our reputation.

Helping hand

Think about any paradigm shifts that you need to help your users go through. These might be hard for you to see, because you’ve already made the shift (but others haven’t yet.) A great example of this is “architecting for the cloud,” or “architect for failure” when folks have been used to highly redundant and protected physical servers.

Start by documenting and communicating best practices to your users. Involve your early adopter beta users in helping to disseminate that info as well.

This was a big challenge at Go Daddy, too, because people weren’t as willing to embrace “the cloud” architecture as we assumed. It turns out people needed a lot more training and education on this than we thought. They continued to treat things as “pets” instead of “cattle”, and they protected and hoarded their VMs.

And because of that, any time we did maintenance on the platform, it was really impactful.

Light touch

Speaking of maintenance, take your maintenances and outages seriously! Try to have as light of a touch as possible. Even small blips that don’t seem significant to you can be a big deal to your users. These can really hurt your platform’s reputation and cause people not to trust it.

Personally, this one was tough for me. After all, users should be building their apps to be cloud native and resilient to failure, right? We should be able to kill a few instances and they will get recreated, no problem. Even today at SendGrid, there are some applications deployed in Kubernetes that aren’t architected well and really shouldn’t be running there.

So in my brain, I think, “why do I have to treat these things with kid gloves?  I just want to get my upgrade done!”

But in reality, these impactful maintenance episodes exposed a lot of other single points of failure, so a maintenance on our platform could cascade into a larger outage.

In the end, this just causes more scrutiny and attention on your platform, which you really don’t want. Ultimately it just makes life worse for you. So do what you can to minimize the impact of maintenances.

Reduce the pain

When you do have to do maintenance and take outages (after all, we do have to do this sometimes) provide some good signaling to your users.

Think, for example, of responding with a 503 status rather than letting connections time out. It’s a better experience, and your end users can deal with that situation a lot more easily. Figure out what that signaling will be, and write it into your service contract so your users are aware and can plan for it.
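
As a minimal sketch of why that matters from the client side (the endpoint, timeout and messages below are made up for illustration), a 503 can be recognized and retried immediately, while a dropped connection just burns the whole timeout before the client learns anything:

# hypothetical client-side check; host name and timings are illustrative only
code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' https://api.example.com/health)
rc=$?
if [ "$rc" -ne 0 ]; then
  echo "no response at all (curl exit $rc) - looks like an unplanned outage"
elif [ "$code" = "503" ]; then
  echo "got a 503 - planned maintenance, back off and retry later"
else
  echo "healthy (HTTP $code)"
fi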

Also make sure you think about other systems your platform depends on.

Our Keystone was backed by Active Directory and any time Keystone wasn’t able to contact a domain controller for some reason, that had a lot of downstream effects on pretty much everything.

Another example, in an earlier cloud servers product, all the VMs were backed by NFS storage. There was an incident where both redundant network switches were rebooted at the same time, cutting us off from the NAS, which basically killed everything.

So, again, just be aware of these potential trouble spots and do your best to provide useful signaling to your users when something goes wrong.

Measure everything

This is almost always an afterthought (at least it has been for me.) Really try hard to collect as many metrics as you can from the start. Start with your best guess of what metrics will be useful, you can always adjust and change these later.

And then actually look at them! Look for trends and outliers.

When you’re troubleshooting something and you think, “I wish I could see the trending of this resource”, or “I wish we were collecting this other metric,” make note of that so you can circle back and add more useful measurements.

Another tip is to think about this from the end-user’s perspective:  API call times, latency, and error rates are the things that really matter to them. So they should matter to you, too.

Lead the capacity curve

You have to keep up with capacity! Make sure you don’t run out of space — this has happened to me more than once.

This goes along closely with metrics: know your usage patterns and know when you need to add more capacity (or clean up old stuff to reclaim space.)

If you run out of space, that really hurts your reputation. People start hoarding resources because they’re afraid you will run out of space again, and they want to make sure they have the resources they need.

I know this can be kind of hard because often this isn’t something you look at on a daily basis. But you really should! And it means real money when you need to add more. But it’s super important in order to build and keep trust in your platform.

Build backstops

But you should also build in some protections against this, primarily by setting resource quotas for your users. There are many ways to approach this. Maybe the quotas are tied to departmental budgets. Or you just give everyone a fairly large quota just to protect against run-away provisioning. But you must do something.

Even if you don’t specifically bill for capacity or are running an internal or dedicated system, you still need to do this. The purpose is to protect the integrity and trust in your platform.
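
On an OpenStack-based platform, for example, this backstop is a single command per project (the project name and limits below are invented for illustration, not a recommendation):

$ # cap what one project can consume; numbers are illustrative only
$ openstack quota set --instances 20 --cores 80 --ram 163840 --gigabytes 2000 demo-project
$ openstack quota show demo-project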

Show them the money

It helps if you can show people what they’re using, and what impact that has on the platform or company as a whole. Again, even if you don’t actually bill real money for it, translate usage into a money figure for people. It’s easier to inherently understand that way.

If you don’t do some kind of “show back,” people will remain blissfully ignorant and just keep using more. It turns into a tragedy of the commons problem, where no one has any incentive to conserve and there are no consequences to using a little more.

Takeaways

Let me sum up the key takeaways:

  • Success defined by user experience
  • Manage expectations
  • Consistently deliver value
  • Lead and train
  • Constant feedback from users
  • Measure everything
  • Keep up on capacity
  • Backstop protections
  • Always consider reputation

Stay focused on these and keep in mind that the success of your platform is more than just the uptime number!

This post first appeared on Mike Dorman’s blog. Superuser is always interested in community content — get in touch: editorATsuperuser.org

The post Tips for making your next “as-a-service” project a success appeared first on Superuser.

by Mike Dorman at May 30, 2019 02:01 PM

Aptira

Segment Routing in Software Defined Networking Wide Area Networks (SDN-WAN)

While Software Defined Networking (SDN) and Virtualisation are becoming more and more popular amongst both academia and industry, there are still many challenges unsolved. One of these challenges that our client faced was demonstrating Traffic Engineering (TE) using a Segment Routing method on Software Defined Networking Wide Area Networks (SDN-WAN).


The Challenge

Segment Routing emerged in 2013 and this technology made its way to service providers and large enterprises as it contributes to network simplification by removing protocols and simplifying network operations.

Segment Routing simplifies the network: it eliminates protocols and makes network operations easier. It also differentiates network services over the same path, which can improve the end-user experience by applying priority to different types of network traffic.

However, the technology is still new, and a successful implementation requires specialist expertise. So our experts in Software Defined Networking and Service Orchestration, together with our Cloud engineers, collaborated to present a solution to this problem.


The Aptira Solution

Aptira identified Multiprotocol Label Switching (MPLS) labels as the right technology to solve this challenge. Segment routing using MPLS requires that the selected OpenFlow/SDN switches and the SDN controller support MPLS labels, so we set out to identify components that met this requirement before designing the solution.

We identified OVS as a virtual OpenFlow switch, NoviFlow as a hardware OpenFlow switch and OpenDayLight (ODL) as the SDN controller to utilise for this solution as all components support MPLS labels. The solution also needed a Service Orchestrator (SO) to automate the segment routing process. Cloudify was chosen as the SO.

Aptira designed and implemented a set of TOSCA blueprints to implement segment routing, based on the following high-level logic:

  • The first switch in the path pushes onto the packet the MPLS label stack that determines the path
  • The other switches pop the outermost MPLS label and pass the packet to the next hop
  • The penultimate node pops the last MPLS label and sends the packet to the destination of the path

The below diagram details how segment routing works in SDN-WAN: 

The blue tables indicate the rules installed on the switches in the forwarding direction, as follows:

  • A packet originating from VM 1 arrives at OVS1 on input port 2. This switch checks whether the packet arriving on port 2 is an IP packet (the match part of the flow rule), pushes three MPLS labels onto it (103, 102 and 101 respectively) and then sends the packet out via port 1.
  • OVS2 receives the packet on port 2 and matches the packet against its input port and the outermost MPLS label which is 101. In the action part, it pops the MPLS label 101 and sends the packet out via port 1.
  • OVS3 receives the packet on port 2, matches the packet against its input port (i.e. port 2) and the outermost MPLS label 102, pops the MPLS label 102 as the actions set and sends it out via port 1.
  • OVS4 receives the packet on port 2, matches it against the input port and the last MPLS label, which is 103, pops that MPLS label and sends the packet out via output port 1 to VM 2.

The orange tables indicate the rules installed on the switches in the reverse direction. This time OVS4 pushes all the MPLS labels onto the packets and the other switches pop them until the packet reaches the destination.
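
To make the logic concrete, the forwarding-direction rules on OVS1 and OVS2 could be expressed as OpenFlow entries along these lines (the bridge name, ports and label handling are an illustrative sketch, not the exact rules produced by the TOSCA blueprints):

$ # OVS1: match IP packets arriving on port 2, push labels 103, 102 and 101, send out port 1
$ ovs-ofctl -O OpenFlow13 add-flow br0 "in_port=2,ip,actions=push_mpls:0x8847,set_field:103->mpls_label,push_mpls:0x8847,set_field:102->mpls_label,push_mpls:0x8847,set_field:101->mpls_label,output:1"

$ # OVS2: match the outermost label 101 arriving on port 2, pop it (inner labels remain) and forward out port 1
$ ovs-ofctl -O OpenFlow13 add-flow br0 "in_port=2,mpls,mpls_label=101,actions=pop_mpls:0x8847,output:1"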


The Result

The segment routing solution using MPLS labeling was successfully implemented and configured in the evaluation network. The implementation of Segment Routing over SDN-WAN provided a flexible control mechanism for traffic paths. The customer was able to set up end-to-end policies across the WAN and its data centers. Moreover, they could easily steer traffic onto the appropriate path in response to any changes in the network.

Segment routing is simple in concept but challenging to implement, and more so when you have to simulate a production network. Aptira’s world class networking skills enabled us to configure and demonstrate 4 key use cases: PCE, lifecycle, SO creating new services and TE.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Segment Routing in Software Defined Networking Wide Area Networks (SDN-WAN) appeared first on Aptira.

by Aptira at May 30, 2019 01:27 PM

May 29, 2019

OpenStack Superuser

Open Infrastructure, the virtual machine debate and open source in China: Shane Wang’s take

The post Open Infrastructure, the virtual machine debate and open source in China: Shane Wang’s take appeared first on Superuser.

by Nicole Martinelli at May 29, 2019 02:05 PM

Aptira

Analytics Platform Consulting

One of Australia’s leading Universities requires a centralised platform to provide access to large amounts of Telemetry data at different levels, with special attention paid to authentication and authorisation. They asked for our consulting and advice to provide them with the information required to select a vendor for the implementation of the system.


The Challenge

The University has several core internal IT platforms producing telemetry information. Logging and telemetry are currently handled independently for each system, making analysis difficult and time-consuming. The University requires a centralised platform for storage and analysis of system telemetry, collating data from a range of different sources.

While a solution had largely been architected by their internal technical team, the documentation required to allow open tenders had not yet been completed. The high-level nature of the design masked many of the underlying decisions that were required to deliver a performant, scalable solution in line with University expectations and policies.


The Aptira Solution

Aptira was brought in to dig into the details of the problem and scope a solution. By engaging closely with project stakeholders and taking the time to investigate and clearly document the solution and implementation plan, Aptira made sure the project was set up for success.

An architecture was defined for an ELK implementation (Elasticsearch, Logstash and Kibana) hosted in a local OpenStack cloud, and a new Ceph cluster to store the data. Integrations with authentication, monitoring and the initial target data sources were scoped. Over a few short days, the documentation was taken from a two-page overview to a long and detailed description of the requirements, assumptions and acceptance criteria – providing them with enough detailed documentation to proceed to the tender process.


The Result

Our key value proposition is to provide high quality consulting and independent advice across the full range of Cloud and Networking technologies. With the investigations undertaken by our team, the University is now able to approach the market to select a vendor for the implementation of the system, safe in the knowledge that it will meet their needs.

We pride ourselves on remaining independent, providing our customers with the advice and services best suited to their needs – not those that lock them into technology providers. This solution has ensured the University will not be locked into any particular technology or vendor, freeing them to select the most appropriate Vendor to suit their requirements without the restrictions that vendor lock-in often imposes.


Let us make your job easier.
Find out how Aptira's managed services can work for you.

Find Out Here

The post Analytics Platform Consulting appeared first on Aptira.

by Aptira at May 29, 2019 01:33 PM

May 28, 2019

OpenStack Superuser

OpenStack Homebrew Club: Swift in the closet

OpenStack powers more than giant global data centers. This is the first in a series that highlights how Stackers are using it at home.

Here we talk to John Dickinson, one of OpenStack’s longest-running project team leads. In addition to his work with Swift, he was recently given the “Keys to Stack City” Community Contributor Award in recognition of his dedication over the years.

Here he talks about why a Swift cluster is taking up valuable closet space in his San Francisco home and what he plans to do with it next.

When did you start using OpenStack at home and what are you doing with it?

I first started running OpenStack Swift at home about a year ago.

I’ve got a small setup with five nodes in a Swift cluster.  I use it to store backups from a bunch of different computers and store movies that it can then stream to my TV. And I use it for some sharing between people and machines. My kids like to record videos of themselves playing games and stuff like that. They can put it on the Swift cluster, use their own account and then make a little home web page and watch themselves stream.

I’m going to start a new project on making some air sensor monitors and put it throughout the house. And I’ll store that data in Swift.

So it’s part smarthome, part family media vault.  Where do you keep it?

I did take over a closet. Not everyone in the family was happy about that, but that’s where it is.

Do you have any plans to do anything with it next?

The next thing I’m going to start is a smarthome experiment. You can store a lot of time-series sensor data in a Swift cluster.

And I need to expand the usage, especially backup usage across several more computers, so instead of doing something like Time Machine I can back up directly to Swift using a couple of different backup clients on Windows, Linux or Mac computers.

What advice would you give to somebody trying to do this? “Don’t try this at home? Or do this at home?”

It’s absolutely possible to do it at home. And it doesn’t take much; it’s not expensive compared to an equivalent file server or something like that.

In fact, the reason I started using Swift is because my old file server finally died after about 10 years. I wanted something that was a little bit better. The main problem I had with the old file server was the lack of expandability over time. I couldn’t just add a new server to the file server cluster or add a hard drive when there are no more available ports on the motherboard.

I know how Swift works, so I figured I’d use it to store all of that data. It’s absolutely possible to get started. I’m using really cheap low-powered ARM single-board computers but you can go bigger than that. I chose to have several individual servers that are each connected to just one hard drive. It’s super lightweight, but super flexible. You could do something with a slightly bigger server that has several hard drives. And if that’s what you have, that would totally work as well.

What’s the ballpark cost?

Well, of course it depends on what size hard drive you get, I think I ended up spending about $70 each on a hard drive. So four of those plus roughly $50 for each of the single board computers, on average. So you’re looking at around $500 or $600 total. With the current setup, I’ve got a fully redundant scalable storage system that has about two and a half terabytes of usable storage and can store over 2,500 movies.

How did you figure out the right scale?

It was primarily budget and then thinking about what I needed. I didn’t want to spend $1,000 or $2,000 on a huge storage system. Starting from there, I researched the computers and hard drives that would meet my storage needs in the price range I was looking for.

What else do people need to know?

The hard part about any kind of storage system is figuring out operations. I’m just monitoring it manually right now. I’m still in the process of building some of that stuff out, but it doesn’t take lots of daily care and feeding or anything like that. I forget it for a while and then pick it up later. And it works just fine.

The idea of a smart home/spy terrifies me. What are you tracking with it?

I’m not going to put microphones or anything like that all over my house.  I’m going to install air sensors to start with because I’m curious to know more about where we live — it’s a 100-year-old house without central AC. I’m interested in finding out what happens when at night you’re sleeping with all the doors and windows closed. What’s the co2 level? What’s the temperature, humidity? Do we need to add air purifiers or something?

Any other plans?

Once I have that infrastructure set up, I want to track more stuff and expose a lot more about the clusters, especially cluster health, through the same kind of visualization. I also want to expose a lot of network activity; I’ve got a fairly complicated network setup with different networks — my kids, myself — and I want different networks for different smart devices. No reason my washer and dryer need to be on the same network as my work computer.

My goal is to track and monitor all that with pretty graphs on the screen and then maybe make some more decisions — in terms of energy use and trying to figure out where things are.

I do have another kind of dream pet project, a continual video camera that points at the sky…to track the constellations and see how the weather changes, the fog, the city lights… It’s just kind of a fun thing to play with.

You brought Swift into your home. Do you ever stop working?

I don’t spend a lot of time on it! I just knew it was something that I could do, and then you run with it for six months.  At the same time, it’s cool because especially with the files and stuff, the idea that you lose all of your family photos, or some video that your kids made, that’s always a fear when you keep it on a single hard drive somewhere. I’m less worried about that now because I’ve got Swift. I know how that works and that I can trust it.

 

Got an OpenStack homebrew story? Get in touch: editorATopenstack.org

Photo // CC BY NC

The post OpenStack Homebrew Club: Swift in the closet appeared first on Superuser.

by John Dickinson at May 28, 2019 02:05 PM

Aptira

Creating L3 Virtual Private Networks (VPN) Tunneling Between Two Sites

A global player in the telecommunications market requires Layer 3 Virtual Private Networks (VPN) tunneling to send traffic between two sites.


The Challenge

The Telco asked us to implement L2/L3 VPN tunneling so that network services could communicate between two OpenStack instances. A transport network is currently in use to connect these two sites together, with the transport network and sites equipped with OpenVSwitches for sending traffic to physical or virtual endpoints on the network.

As this challenge combined several problem domains and the customer’s internal staff didn’t have the required expertise, Aptira was asked to provide a solution. So our experts in Software Defined Networking and Service Orchestration, together with our Cloud Engineers, got together to present a solution.


The Aptira Solution

To solve this problem, Aptira decided to use Virtual Extensible LAN (VXLAN) technology. VXLAN allows us to segment the network at scale, supporting a very large number of tenants. It also enables us to dynamically allocate resources within or between data centers without being constrained by Layer 2 boundaries. Another constraint of L2 tunneling is that forwarding based on Ethernet addresses does not always scale sufficiently, whereas L3 VPNs are available across the globe on international links.

One of the challenges was the creation of a VXLAN tunnel from an edge OVS machine on the OpenStack deployment to the OVSes on a network outside the OpenStack deployment, and then from there to the edge OVS located at the other OpenStack site. Another challenge was provisioning the VXLAN tunnel and automating the service creation so that it could be reproduced in similar scenarios.

To solve this problem, we used Cloudify as the Service Orchestrator, OpenDayLight (ODL) as the SDN controller and Open vSwitch (OVS) with VXLAN tunneling.

We created TOSCA blueprints for the creation of the transport network between the two sites as well as the creation of an L3 VPN tunnel. Creating nodes, bridges and tunnels on the transport network was automated via Cloudify. Cloudify communicates with ODL via its REST API, while ODL provisions these services and provides topology and statistics information to the upper layers on request.

As part of tunnel creation, VXLAN tunnels were created between all of the OVSes, including the edge switches on the transport network and the OVSes on the OpenStack deployments.
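
For illustration, the per-switch configuration that ODL pushes is equivalent to the following manual Open vSwitch commands (the bridge name, VNI and peer address are placeholder values, not the customer’s):

$ # create a VXLAN port on a transport bridge pointing at the peer VTEP (illustrative values)
$ ovs-vsctl add-br br-transport
$ ovs-vsctl add-port br-transport vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=192.0.2.10 options:key=5000
$ ovs-vsctl show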

Another part of the implementation was the integration between ODL and OpenStack, as well as updating the OpenStack rules to allow communication from the virtual machines on one OpenStack deployment to the virtual machines on the other via the VXLAN tunnel.


The Result

Aptira’s solution for creating a VXLAN tunnel allowed the customer to route network traffic between the two OpenStack deployments securely without any human intervention, while also allowing them to provision network services in real time.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Creating L3 Virtual Private Networks (VPN) Tunneling Between Two Sites appeared first on Aptira.

by Aptira at May 28, 2019 01:31 PM

May 27, 2019

Aptira

Passing the Certified OpenStack Administrator (COA) Exam

Aptira OpenStack Certified OpenStack Administrator COA

One of Australia’s Government owned service providers is moving their network operations to OpenStack. They’d like to continue managing their Cloud platform internally but do not currently have the internal expertise to manage this on their own.


The Challenge

Once the client had begun moving their network operations onto the new OpenStack platform to run their network functions, they quickly recognised that they lacked the in-house OpenStack expertise required to manage it efficiently. The client turned to Aptira to help the team increase their OpenStack skills and pass the Certified OpenStack Administrator (COA) Exam.


The Aptira Solution

Aptira provided a full week of on-site OpenStack training to their team. This course provided all the technical knowledge that they needed to pass the COA Exam, including:

  • What is OpenStack
  • An overview of all OpenStack projects, including Nova, Neutron, Glance, Cinder, Ceilometer, Heat and Swift
  • OpenStack Architecture
  • Virtual Machine provisioning walk-through
  • Horizon overview
  • Keystone architecture, including user management and keystone CLI
  • OpenStack message queue configuration
  • Glance Image management CLI and creation of custom images
  • Cinder Storage CLI and managing volumes
  • Linux Virtualisation basics, including hypervisors, KVM and Linux bridges
  • VM placement and provisioning
  • Instance management, including Nova CLI, boot/terminate instance and attaching a volume to an instance
  • Networking in OpenStack, including Nova-Network Vs Neutron, Neutron architecture and plugins, OpenVSwitch and Neutron agents
  • Managing networks, subnets, routers, ports and floating IPs
  • Ceilometer background, use cases, architecture, meters, pipelines and deployment
  • Heat architecture, services and configuration
  • Swift architecture, accounts, node types, partitions, zones and replication
  • Using Swift accounts, creating and managing objects, object server management, container server management, account server management, proxy server management, ring management and large objects

We also ran through a series of hands-on labs to give them the experience they would require to manage their in-house OpenStack platform (a short CLI sketch of one such lab follows the list):

  • Health checks
  • Test instance creation
  • Creating and managing users, roles, tenants and quotas, images and volumes
  • Check messaging
  • Configuring flat networking
  • Creation and management of VMs
  • Configuring VM metadata
  • Creating routers, networks, subnets
  • Associating floating IPs
  • Troubleshooting Neutron networking
  • Working with Ceilometer
  • Online installation – How to install OpenStack with internet access
  • Creating a stack
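
As an indication of the level the labs worked at, the networking exercises boil down to standard OpenStack CLI workflows like the following (names and addresses are placeholders, not the client’s environment):

$ # create a tenant network, router and floating IP, then boot a test instance
$ openstack network create lab-net
$ openstack subnet create --network lab-net --subnet-range 10.0.10.0/24 lab-subnet
$ openstack router create lab-router
$ openstack router set --external-gateway public lab-router
$ openstack router add subnet lab-router lab-subnet
$ openstack server create --image cirros --flavor m1.tiny --network lab-net lab-vm
$ openstack floating ip create public
$ openstack server add floating ip lab-vm 203.0.113.25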

The Result

The training was delivered to 19 students by an on-site Aptira engineer. The class was split into two sessions (10 in one class and 9 in the other) so the engineer could provide more personalised training to each smaller group. All 19 students successfully completed the course and significantly enhanced their OpenStack knowledge. As a result, they can efficiently manage their in-house OpenStack network operations and confidently sit for the COA exam.

We’ve also put together a quick OpenStack guide to help with last minute studying for the Certified OpenStack Administrator Exam. The last chance to sit for the COA exam is September 15, 2019, so if you require assistance with bringing your OpenStack skills up to speed before then, reach out to us. We offer customised, online and on-site group courses for a range of technologies in addition to OpenStack, including DevOps, Kubernetes, Ansible, Docker, KVM, Ceph, Puppet, SDN, NFV, Linux, TOSCA, Grafana, Prometheus, MySQL, Python and more. If your organisation needs to focus on particular technologies, or needs unusual learning outcomes (e.g. sales/presales enablement, or development techniques for cloud native applications), then Aptira can provide you with an unbiased understanding.


Learn from instructors with real world expertise.
Start training with Aptira today.

View Courses

The post Passing the Certified OpenStack Administrator (COA) Exam appeared first on Aptira.

by Aptira at May 27, 2019 01:01 PM

Stephen Finucane

Trading Flexibility for Performance: The HPC Story in OpenStack

This lightning talk was presented at OpenStack Days CERN in May 2019. It gave an overview of the state of the art for HPC and NFV in OpenStack Compute.

May 27, 2019 12:00 AM

May 24, 2019

OpenStack Superuser

What’s next: 5G network slicing with ETSI OSM 5 and OpenStack

Network slicing is an innovative network architecture technology that’s also one of the most exciting promises of 5G telecom networks. Imagine a single spectrum of the network that can be divided logically for different use cases with specific characteristics and network requirements like latency, high bandwidth, security, etc.

Plus, services in those different slices can be managed separately, enabling privacy and dedicated bandwidth to run critical operations. The technology also offers huge performance boosts for specific use cases, such as industrial internet of things, autonomous driving, and more.

There are various interpretations of the standardization of network slicing (NS) from different vendors. Currently, however, various proofs-of-concept are underway to test the operations, performance and basic, ready-to-run architectures for NS.

PoCs and analysis conducted in communities like OpenStack and ETSI OSM have come to some conclusions around network slicing and its readiness. Experts note that end-to-end virtualization, dynamic centralized orchestration and quality-of-service manageability are basic requirements for successful NS implementation.

Last year at the OpenStack Summit in Vancouver, architects Curtis Collicutt and Corey Erickson evaluated OpenStack networking projects for network slicing implementation. More recently at the Mobile World Congress 2019, Telefonica and Telenor collaborated to demonstrate orchestration of network services like enhanced mobile broadband (eMBB) and ultra-reliable low-latency communications (URLLC) in network slicing environments.

Let’s take a look at the capabilities of Open Source MANO (OSM) and OpenStack for network slicing.

Orchestrating 5G network slices with OSM

The latest release, 5, of ETSI OSM came with major enhancements to support network slicing features. OSM has an integrated slice manager and an extended information model covering the network slice template (NST) and network slice instance (NSI).

Having a common information model is a vital feature of OSM. Modelling different entities (network function packages covering VNFs, PNFs and hybrid NFs, network service packages and network slice packages) in a common way helps to tame complex, repetitive network operations and drastically simplifies and automates daily operations. OSM’s network slice feature, together with its information model, allows network services to stay self-contained and agnostic to the underlying technology infrastructure, even when each service has completely different network characteristics.

The proof-of-concept demo included the deployment of two network slices with some input parameters and operating them through day-2 operations at the network slice level (a CLI sketch of the deployment step follows the list below).

Deployment

  • Each slice is modeled as a set of network services connected by networks or VLDs
  • Simple input parameters determine the type of slice to be created on demand
  • The two slices share some network services (shared NS Subnets)
    • If the shared NS has already been deployed, it won’t be re-deployed
    • It will be reused, but the initial configuration for the second network slice can still be done in the shared NS to let it know that new elements are present.
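
A rough sketch of that flow using the OSM client looks like this (the template, slice and VIM names are placeholders, and exact command names and options vary between OSM releases, so treat this as indicative only):

$ # onboard the network slice template, then instantiate a slice against a VIM account
$ osm netslice-template-create my_slice_nst.tar.gz
$ osm netslice-instance-create --nst_name my_slice_nst --nsi_name slice-embb --vim_account my_vim
$ osm netslice-instance-list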

Operation

Running day-two primitives at Network Slice level (handled as a single object)

  • OSM, behind the scenes, maps them to a sequence of calls to NS primitives, which, in turn, are translated into calls to VNF primitives

Here’s a graphic of how it works. Slice one is dedicated to enhanced mobile broadband (eMBB) and slice two is for ultra-reliable low-latency communications (URLLC) use cases.

Figure – Network Slice Orchestration with OSM

OpenStack support for network slicing

In the last two years, OpenStack has focused on fulfilling the orchestration and infrastructure management requirements of the telco cloud. The software contains various projects across the networking and compute domains that can be utilized for the various aspects required in network slicing.

As discussed in a session by Interdynamix architects, OpenStack can be used to satisfy quality of service, isolation, segregation, network sharing and automation/orchestration requirements via the Neutron APIs. OpenStack, in this use case, is mainly highlighted for the policy and scheduling features in projects like Neutron for networking and Nova for compute. OpenStack’s Group-Based Policy (GBP) is suggested for network slicing; it can enable self-service automation and application scaling, separation of policies between slices, management of security requirements, etc.

Complementary OSF projects like StarlingX and Zuul also have capabilities that align to support 5G network slicing.

About the author

Sagar Nangare is a technology blogger, focusing on data center technologies (networking, telecom, cloud, storage) and emerging domains like edge computing, IoT, machine learning and AI. He works at Calsoft Inc. as a digital strategist.

The post What’s next: 5G network slicing with ETSI OSM 5 and OpenStack appeared first on Superuser.

by Sagar Nangare at May 24, 2019 02:01 PM

Aptira

Platform and Service Assurance in a Network Functions Virtualisation (NFV) Platform

One of the biggest challenges in building a Network Functions Virtualisation (NFV) platform is reducing OPEX costs and bringing flexibility to the platform. Most vendors offer monitoring tools; however, many of these tools don’t have the visibility to detect issues taking place within other components, which forces the use of multiple systems, increasing cost and reducing platform flexibility.


The Challenge

Platform agility is only possible if there is complete operational visibility across the following components in the NFV stack:

  • Virtualized Network Functions (VNF)
  • Virtualized Infrastructure Manager (VIM) where VNF workloads are deployed as either VMs/Containers
  • Hardware/Infrastructure layer – Racks, Bare metal nodes
  • Network Layer – Switches, Routers, SDNs

Most vendors offer a different suite of monitoring tools for each component in order to ensure the operational and production readiness of the layer they operate in. For instance, each VNF vendor ships a Virtual Network Functions Manager (VNFM) that handles lifecycle events, e.g. self-remediation of a service in the VNF should it encounter a problem. However, this VNF-specific monitoring tool has no visibility of issues occurring in any other component. Problem diagnosis therefore requires an operator to interrogate multiple systems, which means multiple UIs, multiple monitoring models and multiple views and/or reports.


The Aptira Solution

A centralised Service and Platform Assurance system is required, integrating with multiple heterogeneous components. This solves the lack of complete visibility across the whole Network Functions Virtualisation platform and across different Network Functions Virtualisation Infrastructure (NFVi) Points of Presence (PoPs). Implementing such a centralised system requires identifying all of the failure domains in the platform, their critical data points and a mechanism to extract those data points.

So, key responsibilities of the system include:

  • A data collection mechanism to collect data points such as performance metrics, usage
  • A policy framework that defines a set of policies to correlate the data collected and perform corrective actions by detecting anomalies
  • A single dashboard view that gives the information of all KPIs in the system such as Alarms, Capacity, Impacted services/components, Congestion

A representation of such a system is shown below:

Aptira - NFV Network Function Virtualisation

Aptira solved this for a large Telco by developing a framework using Open Source tools – TICK Stack and Cloudify. TICK stack was selected due to its wide community support and the stack’s existing integrations with 3rd party software components. Cloudify was selected because of its ability to handle CLAMP policies at scale across NFV domains.

The TICK Stack uses Telegraf as its data collection component, which uses a wide range of plugins to collect different sets of data from multiple sources. Aptira used REST plugins to fetch data from components such as the VNFM, OpenStack and Kubernetes endpoints, and SNMP plugins for legacy VNFs. Once collected, the data is stored in the InfluxDB database for further analysis.
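
As an indicative sketch of the collection layer (the URLs, SNMP community string and the omitted field definitions are placeholders, not the production configuration), the corresponding Telegraf inputs can be dropped into a config fragment like this:

$ sudo tee /etc/telegraf/telegraf.d/nfv-inputs.conf >/dev/null <<'EOF'
# poll a REST endpoint (e.g. a VNFM health API); the URL is a placeholder
[[inputs.http]]
  urls = ["https://vnfm.example.com/api/v1/health"]
  data_format = "json"

# poll a legacy VNF via SNMP; agent address and community are placeholders,
# and the field/table definitions are omitted from this sketch
[[inputs.snmp]]
  agents = ["udp://192.0.2.20:161"]
  community = "public"
EOF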

The TICK Stack uses the Kapacitor component to define event management policies. These policies correlate the events/data collected and trigger corrective actions. Aptira designed and implemented policies that acted on data collected from the OpenStack endpoints and on VNF telemetry data to detect anomalies and trigger a remediation plan: for example, detecting that a VNF is unhealthy (e.g. due to high CPU load/throughput) and triggering a remediation process (e.g. auto-scaling to distribute the high load across more instances of the VNF).
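
A heavily simplified example of such a policy as a Kapacitor TICKscript (the measurement, threshold and webhook URL are illustrative; the production policies drive the Cloudify NFVO described below):

$ cat > vnf_cpu_alert.tick <<'EOF'
// alert when CPU idle drops below 10% and notify an orchestration webhook (URL is a placeholder)
stream
    |from()
        .measurement('cpu')
    |alert()
        .crit(lambda: "usage_idle" < 10)
        .post('https://cloudify.example.com/hooks/vnf-scale-out')
EOF
$ kapacitor define vnf_cpu_alert -type stream -tick vnf_cpu_alert.tick -dbrp telegraf.autogen
$ kapacitor enable vnf_cpu_alert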

Since each VNF is modelled and orchestrated by the Cloudify NFVO, the Kapacitor policies interact with Cloudify to perform corrective actions at the domain level, such as rerouting all the traffic being sent to the affected VNF to another VNF, thereby applying a closed-loop control policy.


The Result

To have complete visibility of the platform and the services running on it, it is important to integrate a subsystem into the Network Functions Virtualisation platform that not only ensures the uptime of the components, but also provides enough information for the Operations team to identify anomaly patterns and give quick feedback to the teams concerned.

These Open Source tools have enabled us to provide the required visibility into their NFV platform, reducing the customer’s OPEX costs and increasing the flexibility of their platform.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Platform and Service Assurance in a Network Functions Virtualisation (NFV) Platform appeared first on Aptira.

by Aptira at May 24, 2019 01:25 PM

Chris Dent

Placement Update 19-20

Placement update 19-20. Lots of cleanups in progress, laying in the groundwork to do the nested magic work (see themes below).

The poll to determine what to do with the weekly meeting will close at the end of today. Thus far the leader is office hours. Whatever the outcome, the meeting that would happen this coming Monday is cancelled because many people will be having a holiday.

Most Important

The spec for nested magic is ready for more robust review. Since most of the work happening in placement this cycle is described by that spec, getting it reviewed well and quickly is important.

Generally speaking: review things. This is, and always will be, the most important thing to do.

What's Changed

  • os-resource-classes 0.4.0 was released, promptly breaking the placement gate (tests are broken not os-resource-classes). Fixes underway.

  • Null root provider protections have been removed and a blocker migration and status check added. This removes a few now redundant joins in the SQL queries which should help with our ongoing efforts to speed up and simplify getting allocation candidates.

  • I had suggested an additional core group for os-traits and os-resource-classes but after discussion with various people it was decided it's easier/better to be aware of the right subject matter experts and call them in to the reviews when required.

Specs/Features

  • https://review.opendev.org/654799 Support Consumer Types. This is very close with a few details to work out on what we're willing and able to query on. It's a week later and it still only has reviews from me so far.

  • https://review.opendev.org/658510 Spec for Nested Magic. Un-wipped.

  • https://review.opendev.org/657582 Resource provider - request group mapping in allocation candidate. This spec was copied over from nova. It is a requirement of the overall nested magic theme. While it has a well-defined and refined design, there's currently no one on the hook to implement it.

These and other features being considered can be found on the feature worklist.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 20 (-3) stories in the placement group. 0 are untagged. 2 (-2) are bugs. 5 are cleanups. 11 (-1) are rfes. 2 are docs.

If you're interested in helping out with placement, those stories are good places to look.

On launchpad:

osc-placement

osc-placement is currently behind by 11 microversions. No change since the last report.

Pending changes:

Main Themes

Nested Magic

At the PTG we decided that it was worth the effort, in both Nova and Placement, to make the push to make better use of nested providers — things like NUMA layouts, multiple devices, networks — while keeping the "simple" case working well. The general ideas for this are described in a story and an evolving spec.

Some code has started, mostly to reveal issues:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

As we explore and extend nested functionality we'll need to do some work to make sure that the code is maintainable and has suitable performance. There's some work in progress for this that's important enough to call out as a theme:

Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment.

Other Placement

Miscellaneous changes can be found in the usual place.

There are several os-traits changes being discussed.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Starting with the next pupdate I'll also be removing anything that has had no reviews and no activity from the author in 4 weeks. Otherwise these lists get too long and uselessly noisy.

End

As indicated above, I'm going to tune these pupdates to make sure they are reporting only active links. This doesn't mean stalled out stuff will be ignored, just that it won't come back on the lists until someone does some work related to it.

by Chris Dent at May 24, 2019 12:07 PM

May 23, 2019

OpenStack Superuser

What’s next for Zuul

DENVER —  At the first Open Infrastructure Summit, continuous integration/continuous delivery project Zuul crossed the threshold to independent project status along with Kata Containers.

After graduation, now what? James Blair, Zuul maintainer and member of Red Hat’s office of the CTO, says that like all new grads, the project gating system is asking a lot of important “What if?” questions:

  • What if we make this change?
  • What if we upgrade this dependency?
  • What happens to the whole system if this micro-service changes?
  • What if the base container image changes?

Zuul is a project under the OSF’s CI/CD strategic focus area. The community is busy adding new features but one especially worth focusing on is changing the way people develop containerized software.

Zuul is more than CI/CD, Blair says. It offers a new way of testing, called speculative execution, that gives developers the freedom to experiment. “We’ve done it for years with Git, but now we can do it with containers,” he adds.

Here’s why that’s important.

Container images are built on layers. In the stacked image system, the registry is intermediary. This can lead to images in production with inadequate testing and upstream images breaking downstream. “We want to know that changes to base images won’t break the images composed on top,” he says.

With Zuul speculative container images, registries are created as needed. Registries are ephemeral, lasting only as long as each test. Every deployment sees its future state, including the layers it depends on. And this process is invisible to deployment tooling. Test jobs use Zuul’s registry and speculative or production images. When images have been reviewed and pass tests, they’re safe to promote to production.

A key design point: You can use actual production deployment tooling in tests. Zuul speculative container images makes testing more like production, not the other way around. Zuul allows its users to move fast without fear, because its speculative execution feature allows them to find issues and verify solutions in complex systems before committing a single change to production.

Catch the whole six-minute presentation below and check out the other Zuul talks from the Summit.

Get involved

Get the Source
Zuul is Free and Open Source Software. Download the source from git.zuul-ci.org or install it from PyPI.

Read the Docs
Zuul offers extensive documentation.

Join the Mailing List
Zuul has mailing lists for announcements and discussions.

Chat on IRC
Join #zuul on FreeNode.

The post What’s next for Zuul appeared first on Superuser.

by Superuser at May 23, 2019 02:21 PM

Aptira

Network Functions Virtualisation Orchestration (NFVO) Homologation

A major Telco needed to establish whether a particular Network Functions Virtualisation Orchestration (NFVO) solution performed correctly, not only as per the vendor’s claims, but also as per their specific market requirements.


The Challenge

Network Functions Virtualisation (NFV) Orchestration (NFVO) co-ordinates the resources and networks needed to set up Cloud-based services and applications. The customer had a long-established Model Lab which closely mimicked their operational production environment and included instances of various compute hardware and network equipment. There were multiple challenges:

  • The time available in the Model Lab for this exercise was limited
  • Remote access to the infrastructure was intermittent and technically complex
  • Limited support resources
  • The verification constraints for the homologation process were very specific to the customer

The Aptira Solution

This assignment required both broad and deep technical knowledge and the ability to think on the fly as problems arose or technical requirements were clarified.

In addition to our own internal team, we reached out to our network of partners and identified a team of software engineers across Israel, Ukraine and Poland who could provide extra support across multiple time zones for this project. A team was spun up, with project management and technical leadership in Australia on-site with the customer and software engineers spanning four continents. The virtual team co-ordinated activities using Jira, Confluence and Slack.

This team was able to assign tasks amongst themselves to work in parallel:

  • Lab access, environment detail
  • VIM configuration and core software installation
  • Orchestration policy development and testing

Most of the work was completed on the developers’ own infrastructure and integrated into Aptira’s lab environment (with the appropriate simulation of external interfaces). Only once we were sure that the orchestration policies were working from a logic perspective did we schedule access to the customer’s Model Lab to install and test the configuration.

It was only after the orchestration configurations were installed that we could actually interface with the very specific items of equipment required by the customer. These items included:

  • Cloudify Manager cluster in VMware and VMs in OpenStack targeted to be orchestrated by Cloudify
  • Cloudify Manager and vCenter
  • Ericsson Cloud Manager (ECM) and Ericsson Execution Environment (EEC)

After each use case was validated by the customer, it was then rolled out of the Model Lab to free up resources. This validation included:

  • Cloudify Manager HA failover
  • CLAMP functions such as Auto-heal and Auto-scale
  • LDAP-based Authentication
  • Ansible Integration using Netconf
  • User RBAC and Resource Authorisation process (custom development)
  • Alarm Generation
  • Reporting

During the validation process, having resources on different continents meant that we had a de-facto follow-the-sun support arrangement. As such, we were able to fix problems rapidly if we encountered issues in the customer’s Model Lab.


The Result

Although the total job was not huge, the customer’s lab constraints and the specificity of the validation requirements meant that this was an exacting assignment requiring great attention to detail.

As a result of this assignment, we were able to confirm that the Network Functions Virtualisation Orchestration (NFVO) solution performed correctly, not only as per the vendor claims, but also as per their specific market requirements. 

This project could not have been completed without the support of our partners. This is why we go to great lengths to select the best of the best when it comes to technology partners. We do this to provide our customers with innovative solutions that bring better consistency, performance and flexibility. With these partnerships, we’re able to deliver seamless services worldwide without the limitations of operating across multiple time zones.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Network Functions Virtualisation Orchestration (NFVO) Homologation appeared first on Aptira.

by Aptira at May 23, 2019 01:58 PM

May 22, 2019

OpenStack Superuser

Takeaways from the first open multi-vendor NFV showcase

At the recent Open Infrastructure Summit in Denver, Whitestack announced the results of an initiative called the “Open Multivendor NFV Showcase,” an effort to demonstrate that network function virtualization (NFV)-orchestrated network services, integrating VNFs from multiple vendors on top of commoditized hardware, are possible.

This effort, organized by Whitestack, has the support of relevant institutions in the NFV field, in particular: Intel, the OpenStack Foundation, Open Source MANO and the ETSI NFV Plugtests Programme.

For this first edition, Whitestack invited a number of vendors and projects that provide a complete end-to-end service chain, covering critical parts of a typical mobile network. Specifically, the following VNF vendors were integrated to provide a fully-functional and automated network service:

  • Fortinet: Next Generation FW.
  • Open Air Interface: LTE EPC core.
  • Mobileum: Diameter Routing Agent and Network Traffic Redirection.
  • ng4T: vTester supporting Cloud-RAN, vHSS and other functions.

See the complete session, including a live demo here or below and download the complete report by clicking here.

The post Takeaways from the first open multi-vendor NFV showcase appeared first on Superuser.

by Gianpietro Lavado at May 22, 2019 02:04 PM

Aptira

OpenStack Container Orchestration with Kolla-Ansible

Aptira OpenStack Kolla Ansible

A leading international Software Defined Networking (SDN) and Data Analytics provider wanted to upgrade their applications to utilise OpenStack and Container Orchestration, but they were running into complications and needed a bit of extra help.


The Challenge

This customer had attempted to deploy OpenStack on their own several times but had run into complications. OpenStack is not their core expertise, and their design was based on TripleO, which is quite complicated to deploy and operate. They required help with the platform design and configuration so they could containerise their applications. As they are located overseas, our Solutionauts worked across time zones and completed all work remotely.


The Aptira Solution

Aptira designed a containerised OpenStack solution utilising Ceph as the backend storage for image, volume and compute. We then used kolla-ansible to deploy Ceph and OpenStack. We chose this configuration because it’s relatively simple to use kolla-ansible to customise configurations and to change the deployment, making ongoing configuration changes easier for the customer once the project was handed over.
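
For context, the deployment workflow with kolla-ansible boils down to a handful of commands (the inventory file name and exact options depend on the environment; this is a sketch, not the customer’s actual run):

$ # generate passwords, then bootstrap, validate and deploy the containerised services
$ kolla-genpwd
$ kolla-ansible -i ./multinode bootstrap-servers
$ kolla-ansible -i ./multinode prechecks
$ kolla-ansible -i ./multinode deploy
$ kolla-ansible -i ./multinode post-deploy    # generates an admin openrc file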

The Ceph cluster had 4 replicas, with the Ceph mons/mgrs running on 3 rack servers, while the object storage devices (OSDs) ran across 8 blade servers in two chassis. Three regions host their apps in different failure domains for redundancy. The OpenStack controllers were converged with the Ceph mons on the rack servers, and Compute was collocated with the OSDs on the blade servers.

We successfully resolved a number of issues that arose during the implementation. One issue we faced was a memory leak bug in the OVS code which had not yet been fixed upstream. As a temporary workaround, we needed to restart the neutron agent services regularly to release memory until the bug is fixed upstream. To speed up this process and remove manual intervention, we set up a cron job that restarts the neutron agent services outside of business hours.
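
A simplified version of that workaround, assuming the standard Kolla container names (these can differ between deployments), is a single crontab entry on each network node, added with crontab -e:

$ # restart the neutron agent containers at 03:00 every Sunday; schedule and container names are illustrative
0 3 * * 0 docker restart neutron_openvswitch_agent neutron_l3_agent neutron_dhcp_agent neutron_metadata_agent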

A separate challenge was that the default haproxy maxconn value was not large enough, resulting in instability. To resolve this, we increased the maxconn value in the haproxy config file, improving the stability of the platform.
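
The fix itself is a one-line change in the global section of haproxy.cfg, followed by a restart of the haproxy container (the value shown is indicative only, not the figure used in this deployment):

# haproxy.cfg (global section) - indicative value only
global
    maxconn 40000

$ docker restart haproxy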


The Result

We delivered the Ceph-backed OpenStack Cloud on which their applications are now deployed. The configuration has passed their HA tests and is being used in production.

It is important to note that we deployed OpenStack Rocky which was the latest stable OpenStack release at the time of this project. Unfortunately, Kolla-ansible is unable to complete upgrades and downgrades on this version, so future work will be required in order to simplify the upgrade/downgrade process. Stay tuned!


Orchestrate your Application into the Future
Containerise your App

Get Started

The post OpenStack Container Orchestration with Kolla-Ansible appeared first on Aptira.

by Aptira at May 22, 2019 01:31 PM

Thomas Goirand

Wrote a Debian mirror setup puppet module in 3 hours

As I needed the functionality, I wrote this:

https://salsa.debian.org/openstack-team/puppet/puppet-module-debian-archvsync

The matching Debian package has been uploaded and is now in the NEW queue. Thanks a lot to Waldi for packaging ftpsync, which I’m using.

Comments and contributions are welcome.

by Goirand Thomas at May 22, 2019 12:40 PM

CERN Tech Blog

Cluster Autoscaling for Magnum Kubernetes Clusters

The Kubernetes cluster autoscaler has been in development since 2016, with early support for the major public cloud providers for Kubernetes. But, there has been no way to use it running Kubernetes on OpenStack until now, with the addition of the autoscaler cloud provider for Magnum. As an OpenStack cloud operator with around 400 Magnum clusters (majority Kubernetes), CERN has a lot to gain from the flexibility that the cluster autoscaler provides.

by CERN (techblog-contact@cern.ch) at May 22, 2019 10:00 AM

May 21, 2019

OpenStack Superuser

Firecracker and Kata Containers: Sparking more open collaboration

DENVER — Some pairings really do spark joy. Peanut butter and chocolate. Wine and cheese. Biscuits and gravy. The concept crosses over to the tech world: Firecracker and Kata Containers.

On the Open Infrastructure keynote stage in Denver, Samuel Ortiz, architecture committee, Kata Containers and Andreea Florescu, maintainer, Firecracker project, talked about how the projects are working together.

The pair introduced a new collaborative project: rust-vmm. Firecracker allows Kata Containers to support a large number of container workloads, but not all of them. OSF, Amazon, Intel, Google and others are now collaborating to build a custom container hypervisor. Enter rust-vmm, a project featuring shared virtualization components to build specialized VMMs.

But let’s get up to speed on the two projects and what’s next for them in detail.

Kata Containers

Kata Containers aims to improve security in the container ecosystem by adding lightweight VMs and hypervisors as another, hardware-based workload isolation layer for containers. The project has shipped a number of enhancements since May 2018 (six releases to date, with another shipping soon), including the following (a brief usage sketch follows the list):

  • Improved performance with VM templating, TC mirroring for better networking and the soon-to-be-integrated Virtio-fs support.
  • Improved simplicity and operability by adding distributed tracing support, live update and overall simplified architecture based on vsock.
  • Improved industry support by adding new hardware architectures like ARM64, ppc64 and s390
  • Even stronger security architecture by adding more libcontainer-based isolation layers inside the virtual machine, but most importantly by supporting more hypervisors, including QEMU, NEMU and Firecracker.

Firecracker

Firecracker is an open-source, lightweight virtual machine monitor written in Rust. It leverages Linux Kernel Virtual Machine (KVM) to provide isolation for multi-tenant cloud workloads like containers and functions.

What makes it great:

  • It “boots blazingly fast” (under 125 milliseconds)
  • Low memory footprint (<5MiB), helping it achieve high densities
  • Oversubscription
  • Security: two boundaries–virtualization and jailer

Florescu also outlined some of the main enhancements in progress:

  • ARM and AMD support
  • Refactoring the codebase for standalone virtualization components that can be used by other projects.
  • Container integration: Transitioning from an experimental implementation of Vsock to a production ready version; also integrating firecracker-containerd, which is a container runtime on top of Firecracker.

Check out the whole 12-minute keynote below and stay tuned for a video from their Summit session titled “Tailor-made security: Building a container specific hypervisor.”

Photo // CC BY NC

The post Firecracker and Kata Containers: Sparking more open collaboration appeared first on Superuser.

by Superuser at May 21, 2019 02:07 PM

Aptira

Zenoss Implementation

Zenoss Monitoring

This use case covers two of a Telco’s custom-developed platforms that provide network overlay services to Fortune 500 companies and government entities. They require platform-wide monitoring and would like to utilise Zenoss – an intelligent application and service monitoring tool.


The Challenge

The two custom platforms this Telco has developed consist of OpenFlow Network infrastructure, OpenStack Infrastructure and several applications including: Cloudify, an SDN Controller and Network Flow Programming Tools.

The customer had a requirement for platform-wide monitoring to capture operational events in the platform and send them to a third-party dashboard in near real-time. They had already selected (and were previously using) a tool named Zenoss as the event monitoring and management platform and asked Aptira to configure it to meet their requirements.

The platform components that required monitoring included:

Hardware

  • Bare Metal Servers
  • Top of Rack Network Switches
  • Noviflow switches

Software

  • Linux Operating System
  • Server OS (KVM hypervisors)
  • Top of Rack Network Switch Operating System
  • Noviware Operating System

Applications

  • OpenStack + Ceph cluster
  • Cloudify cluster
  • SDN controllers

Their requirements extended to additional metrics and custom events, thresholds and alerts that were not available in the standard platform. In order to implement these requirements, we exploited a feature of Zenoss that allows easy expansion of monitoring capability as modular plugins. These plugins are called ZenPacks.

Some of the customer’s requirements were covered in existing Zenpacks, e.g. the OpenStack and Bare Metal Server ILO Zenpacks. However, most requirements were not covered by any existing Zenpacks. Examples of components that needed additional capabilities include:

  • Cloudify Services and Cluster health check
  • SDN controller service and UI health check
  • SDN Etree, Eline services status
  • Noviflow Eline and Etree paths
  • Noviflow CLI, OF, Physical ports status, etc

Implementing these additional capabilities was a key objective of Aptira’s solution.


The Aptira Solution

Aptira developed custom capabilities for Zenoss to provide the required custom monitoring functionality and to send alerts to a third-party dashboard, delivering these additional requirements as custom plugins.

We considered the option of enhancing existing ZenPacks like OpenStack, Bare Metal Servers ILO, and Linux. However, this would have created dependencies on multiple existing ZenPacks and complicated their lifecycle management: any updates to those ZenPacks would have had to be continually reconciled with the custom-developed enhancements.

Instead, we developed a single custom ZenPack and integrated it with Zenoss and the platforms, implementing the required functionality while leaving the existing ZenPacks untouched and easy to update and maintain.
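For illustration, a custom ZenPack is typically packaged as a Python egg and loaded on the Zenoss master with the zenpack command. A minimal sketch (the egg name is a placeholder, and the restart step varies by Zenoss version):

# Run as the zenoss user on the Zenoss master (egg name is a placeholder)
zenpack --install ZenPacks.example.CustomMonitoring-1.0.0-py2.7.egg
# Restart the Zenoss daemons so the new pack is loaded (command varies by Zenoss version)
zenoss restart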

To send the alerts to the third-party dashboard, we created Ansible playbooks which can easily add/remove devices to/from Zenoss and the third-party alerting dashboard.

Aptira also developed Ansible playbooks to perform maintenance functions on the integrated Zenoss solution, including adding devices and configuring events, triggers and notifications/alerts via those playbooks.
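The playbooks themselves are not published here, but one way such a playbook might add a device is via the Zenoss JSON API's DeviceRouter endpoint. A rough sketch with curl (purely illustrative: host, credentials and device class are placeholders, field names can differ between Zenoss versions, and the actual playbooks may use a different mechanism):

curl -sk -u admin:password \
  -H "Content-Type: application/json" \
  -X POST https://zenoss.example.com/zport/dmd/device_router \
  -d '{"action": "DeviceRouter",
       "method": "addDevice",
       "data": [{"deviceName": "server01.example.com",
                 "deviceClass": "/Server/Linux"}],
       "type": "rpc",
       "tid": 1}'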

This entire process was completed (including requirements, design, development and configuration) within the customer’s platform. This included:

  • Adding the devices in the third-party alerting dashboard and Zenoss
  • Configuring the Ansible playbooks that perform event notification
  • Installing the custom-developed ZenPack on the customer’s operational Zenoss system

The Result

Aptira successfully completed this enhancement, enabling the customer to monitor all devices according to their custom requirements. Events are now visible within Zenoss as well as the third-party dashboard.


Monitoring and Machine Learning
Detect Anomalies within Complex Systems

Find Out Here

The post Zenoss Implementation appeared first on Aptira.

by Aptira at May 21, 2019 01:37 PM

Carlos Camacho

The Kubernetes in a box project

Implementing cloud computing solutions that run in hybrid environments might be the best approach when it comes to finding the best benefit/cost ratio.

This post will be the main thread to build and describe the KIAB/Kubebox project (www.kubebox.org and/or www.kiab.org).

Spoiler alert!

The name

First things first, the name. I have two names in mind with the same meaning. The first one is KIAB (Kubernetes In A Box); this name came to my mind from the Kiai sound made by karatekas (practitioners of karate). The second one is more traditional, “Kubebox”. I have no preference, but it would be awesome if you could help me decide the official name for this project.

Add a comment and contribute to select the project name!

Introduction

This project is about integrating devices that are already available on the market to run cloud software as an appliance.

The proof-of-concept delivered in this series of posts will allow people to put a well-known set of hardware devices into a single chassis to create their own cloud appliances, whether for research and development, continuous integration, testing, home labs, staging or production-ready environments, or simply just for fun.

I humbly present to you the design of KubeBox/KIAB, an open chassis specification for building cloud appliances.

The case enclosure is fully designed and is hopefully in the last phases before the first set of enclosures is built. The posts will appear as I find some free cycles to write the overall description.

Use cases

Several use cases can be defined to run on a KubeBox chassis.

  • AWS outpost.
  • Development environments.
  • EDGE.
  • Production Environments for small sites.
  • GitLab CI integration.
  • Demos for summits and conferences.
  • R&D: FPGA usage, deep learning, AI, TensorFlow, among many others.
  • Marketing WOW effect.
  • Training.

Enclosure design

The enclosure is designed as a rackable 7U unit. It tries to minimize the space needed to deploy a cluster of up to 8 nodes with redundancy for both power and networking.

Cloud appliance description

This build will be described across several sub-posts linked from this main thread. The posts will be created in no particular order, depending on my availability.

  • Backstory and initial parts selection.
  • Designing the case part 1: Design software.
  • A brief introduction to CAD software.
  • Designing the case part 2: U’s, brakes, and ghosts.
  • Designing the case part 3: Sheet thickness and bend radius.
  • Designing the case part 4: Parts Allowance (finish, tolerance, and fit).
  • Designing the case part 5: Vent cutouts and frickin’ laser beams!.
  • Designing the case part 6: Self-clinching nuts and standoffs.
  • Designing the case part 7: The standoffs strike back.
  • A brief primer on screws and PEMSERTs.
  • Designing the case part 8: Implementing PEMSERTs and screws.
  • Designing the case part 9: Bend reliefs and flat patterns.
  • Designing the case part 10: Tray caddy, to be used with GPUs, motherboards, disks and any other peripherals you want to add to the enclosure.
  • Designing the case part 11: Components rig.
  • Designing the case part 12: Power supply.
  • Designing the case part 13: Networking.
  • Designing the case part 14: 3D printed supports.
  • Designing the case part 15: Adding computing power.
  • Designing the case part 16: Adding Storage.
  • Designing the case part 17: Front display and bastion for automation.
  • Manufacturing the case part 1: PEMSERT installation.
  • Manufacturing the case part 2: Bending metal.
  • Manufacturing the case part 3: Bending metal.
  • KubeBox cloud appliance in detail!.
  • Manufacturing the case part 0: Getting quotes.
  • Manufacturing the case part 1: Getting the cases.
  • Software deployments: Reference architecture.
  • Design final source files for the enclosure design.
  • KubeBox is fully functional.

Update log:

2019/05/21: Initial version.

by Carlos Camacho at May 21, 2019 12:00 AM

May 20, 2019

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight on the Open Infrastructure Summit Denver

The global community gathered recently in Denver for the Open Infrastructure Summit followed by the Project Teams Gathering (PTG). This was the first edition under the new name, which was changed to better reflect the diversity of open-source communities collaborating at the Summit. With the co-location of the PTG, the week had more collaborative sessions than ever and attendees had the opportunity to collaborate throughout the week with presentations, workshops, and collaborative sessions covering the development, integration and deployment of more than 30 open-source projects.

The theme of the week was “Collaboration without Boundaries,” a call to the community shared by Jonathan Bryce in his Monday morning keynote. Collaboration was exemplified throughout the week from the developers, operators and vendors attending the event:

  • Developers from the Kata Containers and Firecracker projects highlighted the progress around community collaboration and project integration. They also discussed Rust-VMM, a cross-project collaborative initiative to develop container-specific hypervisors.
  • Operators from Baidu, Blizzard Entertainment, CERN, Box, Adobe Advertising Cloud and more presented their open infrastructure use cases, highlighting the integration of multiple technologies including Ceph, Kata Containers, Kubernetes, and OpenStack among 30+ other projects.
  • 5G was front and center at the Denver Summit. In a demonstration of open collaboration, Ericsson partnered with AT&T to host a 5G Lounge where attendees could test the latency of network speeds while playing a virtual reality game, Strike a Light. Users like China Mobile and AT&T presented about their 5G deployments. At AT&T, 5G is powered by an Airship-based containerized OpenStack cloud.
  • The NIST Public Working Group on Federated Cloud and The Open Research Cloud Alliance discussed possible federation deployment and governance models that embody the key concepts and design principles being developed in the NIST/IEEE Joint working group and ORCA. They want to encourage developers, users and cloud operators to provide use cases and feedback as they move forward in these efforts.

 

Amy Wheelus and Mark Collier in an epic latency battle with 3G, 4G and 5G on the keynote stage.

Denver Summit session videos are now available and for more Summit announcements, user sessions and news from the Open Infrastructure ecosystem, check out the Superuser recap.
Next, the Open Infrastructure Summit and PTG are heading to Shanghai. Registration and Sponsorship sales are now open. If you’re interested in speaking, the Call for Presentations is open. Check out the list of Tracks and submit your presentations, panels and workshops before July 2, 2019.

OpenStack Foundation news

  • At the Open Infrastructure Summit, the OpenStack Board of Directors confirmed Zuul as a top-level Open Infrastructure project, joining OpenStack and recently confirmed Kata Containers.
  • The OSF launched the OpenStack Ironic Bare Metal Program in Denver, highlighting the commercial ecosystem for Ironic, at-scale deployments of Ironic, and evolution of OpenStack beyond virtual machines. Case studies by CERN and Platform9 were published along with the announcement.

OpenStack Foundation project news

Airship

  • The Airship team delivered its first release at the Open Infrastructure Summit Denver. Airship 1.0 delivers a wide range of enhancements to security, resiliency, continuous integration and documentation, as well as upgrades to the platform, deployment and tooling features.

Kata Containers

  • The community delivered several talks during the Open Infrastructure Summit in Denver that you can check out among the videos from the event.
  • Kata Containers continues to provide improvements around performance, stability and security.  Expected this week, the 1.7 release of Kata Containers includes experimental support for virtio-fs in the NEMU VMM. For workloads which require host to guest sharing, virtio-fs provides improved performance and compatibility compared to 9pfs. This release adjusts the guest kernel in order to facilitate Docker-in-Kata use cases, and adds support for the latest version of Firecracker.

OpenStack

StarlingX

  • There were five sessions dedicated to the project at the Open Infrastructure Summit in Denver, check out the videos here.
  • Participants packed the room for a hands-on workshop to try out the StarlingX platform on hardware donated by Packet.com. If you missed this one, keep an eye out for similar workshops at upcoming community and industry events.
  • The team had great discussions during Forum sessions as well as at the PTG to deep dive into the details of processes, testing and roadmap planning for the upcoming two releases.

Zuul

  • The community delivered several talks during the Open Infrastructure Summit in Denver that you can check out among the videos from the event.
  • Zuul 3.8.1 was released, fixing a memory leak introduced in the previous 3.8.0 release. Users should update to at least 3.8.1 to fix this bug. More info can be found in the release notes.
  • Nodepool 3.6.0 was released. This release improves API rate limiting against OpenStack clouds and statsd metric gathering performed by the builder process. Find more info in the release notes.

OSF @ Open Infrastructure community events

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org . To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at May 20, 2019 02:06 PM

Aptira

Big Data

One of Australia’s largest and best-known organisations has an internal Big Data team who manage their system in a static, traditional way, which makes it difficult for them to expand the system. So we designed a solution they can use to upgrade their Big Data system from static infrastructure onto a flexible OpenStack Cloud.


The Challenge

The customer is running an internal Big Data system, which collects data from various internal data sources, providing a static view of this data to the management team. They are using a Hadoop-based system and have been experiencing issues with scalability. As such, they want to explore a Cloud-based hosting platform, which will give them more flexibility and scalability. They’d also like to migrate their existing production system onto the new hosting platform once it is proven to be stable and production ready.

The design of the Cloud platform had to balance short term specific goals against long term objectives. The Customer’s future vision for the platform included minimisation of the total cost of operational ownership, by designing the platform to require minimal operational intervention, and by maximising the use of automation in all stages of the operations lifecycle.

The challenge was even greater because Aptira was brought into the project relatively late in its development cycle, and therefore many decisions had already been made. Examples include the hardware and networking platforms having already been selected, the rack placement of these machines already being defined, and the general approach to the management of virtual and physical machines having already been determined.


The Aptira Solution

We may be a little biased when it comes to OpenStack, but there really is no better system for requirements such as these. As such, we’ve offered to build an OpenStack cloud to host their Big Data system.

This OpenStack will be integrated with their existing Cisco Application Centric Infrastructure (ACI) in order to provide a complete end-to-end Software Defined Networking (SDN) solution.

A complication that Aptira had to design around was version compatibility across the integrated subsystems. For example, Cisco ACI was only supported on OpenStack Ocata, which was rapidly approaching end of support. The integration design required painstaking attention to detail to enable functionality while at the same time honouring the design decisions made by the customer.

We have produced a complete design document which they can use to guide their OpenStack deployment and upgrade their Big Data system from static infrastructure onto a flexible OpenStack Cloud solution.


The Result

This project is still in development – stay tuned for updates!


How can we make OpenStack work for you?
Find out what else we can do with OpenStack.

Find Out Here

The post Big Data appeared first on Aptira.

by Aptira at May 20, 2019 01:39 PM

Carlos Camacho

Running Relax-and-Recover to save your OpenStack deployment

ReaR (Relax-and-Recover) is a pretty impressive disaster recovery solution for Linux. It creates both a bootable rescue image and a backup of the associated files you choose.

When performing disaster recovery of a system, this rescue image plays the files back from the backup, restoring the latest state in the twinkling of an eye.

Various configuration options are available for the rescue image: for example, slim ISO files, USB sticks or even images for PXE servers can be generated. Just as many backup options are possible: starting with a simple archive file (e.g. *.tar.gz), various backup technologies such as IBM Tivoli Storage Manager (TSM), EMC NetWorker (Legato), Bacula or even Bareos can be addressed.

ReaR, written in Bash, enables distribution of the rescue image and, if necessary, the archive file via NFS, CIFS (SMB) or another transport method over the network. The actual recovery process then takes place via this transport route.

In this specific case, due to the nature of the OpenStack deployment, we will choose the protocols that are allowed by default in the iptables rules (SSH and SFTP in particular).

But enough with the theory, here’s a practical example of one of many possible configurations. We will apply this specific use of ReaR to recover a failed control plane after a critical maintenance task (like an upgrade).

01 - Prepare the Undercloud backup bucket.

We need to prepare the place to store the backups from the Overcloud. From the Undercloud, check you have enough space to make the backups and prepare the environment. We will also create a user in the Undercloud with no shell access to be able to push the backups from the controllers or the compute nodes.

groupadd backup
mkdir /data
useradd -m -g backup -d /data/backup backup
echo "backup:backup" | chpasswd
chown -R backup:backup /data
chmod -R 755 /data
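
# Optional hardening, not part of the original steps: the commands above do not
# actually remove the backup user's shell. One way to approximate "no shell access"
# while keeping the SFTP transfers that ReaR and sshfs rely on is to force the
# SFTP subsystem for that user in sshd_config.
tee -a /etc/ssh/sshd_config > /dev/null <<'EOF'
Match User backup
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
EOF
systemctl reload sshd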

02 - Run the backup from the Overcloud nodes.

Let’s install some required packages and run some previous configuration steps.

#Install packages
sudo yum install rear genisoimage syslinux lftp wget -y

#Make sure you are able to use sshfs to store the ReaR backup
sudo yum install fuse -y
sudo yum groupinstall "Development tools" -y
wget http://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/f/fuse-sshfs-2.10-1.el7.x86_64.rpm
sudo rpm -i fuse-sshfs-2.10-1.el7.x86_64.rpm

sudo mkdir -p /data/backup
sudo sshfs -o allow_other backup@undercloud-0:/data/backup /data/backup
#Use backup password, which is... backup

Now, let’s configure ReaR config file.

#Configure ReaR
sudo tee -a "/etc/rear/local.conf" > /dev/null <<'EOF'
OUTPUT=ISO
OUTPUT_URL=sftp://backup:backup@undercloud-0/data/backup/
BACKUP=NETFS
BACKUP_URL=sshfs://backup@undercloud-0/data/backup/
BACKUP_PROG_COMPRESS_OPTIONS=( --gzip )
BACKUP_PROG_COMPRESS_SUFFIX=".gz"
BACKUP_PROG_EXCLUDE=( '/tmp/*' '/data/*' )
EOF

Now run the backup, this should create an ISO image in the Undercloud node (/data/backup/).

You will be asked for the backup user password

sudo rear -d -v mkbackup

Now, simulate a failure xD

# sudo rm -rf /lib

After the ISO image is created, we can proceed to verify we can restore it from the Hypervisor.

03 - Prepare the hypervisor.

# Enable the use of fusefs for the VMs on the hypervisor
setsebool -P virt_use_fusefs 1

# Install some required packages
sudo yum install -y fuse-sshfs

# Mount the Undercloud backup folder to access the images
mkdir -p /data/backup
sudo sshfs -o allow_other root@undercloud-0:/data/backup /data/backup
ls /data/backup/*

04 - Stop the damaged controller node.

virsh shutdown controller-0
# virsh destroy controller-0

# Wait until is down
watch virsh list --all

# Backup the guest definition
virsh dumpxml controller-0 > controller-0.xml
cp controller-0.xml controller-0.xml.bak

Now, we need to change the guest definition to boot from the ISO file.

Edit controller-0.xml and update it to boot from the ISO file.

Find the OS section, add the cdrom device and enable the boot menu.

<os>
<boot dev='cdrom'/>
<boot dev='hd'/>
<bootmenu enable='yes'/>
</os>

Edit the devices section and add the CDROM.

<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/data/backup/rear-controller-0.iso'/>
<target dev='hdc' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>

Update the guest definition.

virsh define controller-0.xml

Restart and connect to the guest

virsh start controller-0
virsh console controller-0

You should be able to see the boot menu to start the recovery process; select Recover controller-0 and follow the instructions.

Now, before proceeding with the controller restore, it's possible that the host undercloud-0 can't be resolved. If so, just:

echo "192.168.24.1 undercloud-0" >> /etc/hosts

Having resolved the Undercloud host, just follow the wizard, Relax And Recover :)

You should see a message like:

Welcome to Relax-and-Recover. Run "rear recover" to restore your system !

RESCUE controller-0:~ # rear recover

The image restore should progress quickly.

Continue, and you can watch the restore progress.

Now, each time you reboot, the node will have the ISO file as the first boot option, so that's something we need to fix. In the meantime, let's check whether the restore went fine.

Reboot the guest booting from the hard disk.

Now we can see that the guest VM started successfully.

Now we need to restore the guest to its original definition, so from the hypervisor we restore it from the controller-0.xml.bak file we created.

#From the Hypervisor
virsh shutdown controller-0
watch virsh list --all
virsh define controller-0.xml.bak
virsh start controller-0

Enjoy.

Considerations:

  • Space.
  • Multiple protocols are supported, but we might then need to update firewall rules; that's why I preferred SFTP.
  • Network load when moving data.
  • Shutdown/Starting sequence for HA control plane.
  • Do we need to backup the data plane?
  • User workloads should be handled by third-party backup software.

Update log:

2019/05/20: Initial version.

2019/06/18: Appeared in OpenStack Superuser blog.

by Carlos Camacho at May 20, 2019 12:00 AM

May 17, 2019

Chris Dent

Placement Update 19-19

Woo! Placement update 19-19. First one post PTG and Summit. Thanks to everyone who helped make it a useful event for Placement. Having the pre-PTG meant that we had addressed most issues prior to getting there meaning that people were freed up to work in other areas and the discussions we did have were highly coherent.

Thanks, also, to everyone involved in getting placement deleted from nova. We did that while at the PTG and had a little celebration.

Most Important

We're still working on narrowing priorities and refining the details of those priorities. There's an etherpad where we're taking votes on what's important. There are three specs in progress from that list that need review and refinement. There are two others which have been put on the back burner (see the specs section below).

What's Changed

  • We're now running a subset of nova's functional tests in placement's gate.

  • osc-placement is using the PlacementFixture to run its functional tests making them much faster.

  • There's a set of StoryBoard worklists that can be used to help find in progress work and new bugs. That section also describes how tags are used.

  • There's a summary of summaries email message that summarizes and links to various results from the PTG.

Specs/Features

As the summary of summaries points out, we have two major features this cycle, one of which is large: getting consumer types going and getting a whole suite of features going to support nested providers in a more effective fashion.

  • https://review.opendev.org/654799 Support Consumer Types. This is very close with a few details to work out on what we're willing and able to query on. It only has reviews from me so far.

  • https://review.opendev.org/658510 Spec for Nested Magic. This is associated with a lengthy story that includes visual artifacts from the PTG. It covers several related features to enable nested-related requirements from nova and neutron. It is a work in progress, with several unanswered questions. It is also something that efried started but will be unable to finish so the rest of us will need to finish it up as the questions get answered. And it also mostly subsumes a previous spec on subtree affinity. (Eric, please correct me if I'm wrong on that.)

  • https://review.opendev.org/657582 Resource provider - request group mapping in allocation candidate. This spec was copied over from nova. It is a requirement of the overall nested magic theme. While it has a well-defined and refined design, there's currently no one on the hook to implement it.

There are also two specs that are still live but de-prioritized:

These and other features being considered can be found on the feature worklist.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

There are 23 stories in the placement group. 0 are untagged. 4 are bugs. 5 are cleanups. 12 are rfes. 2 are docs.

If you're interested in helping out with placement, those stories are good places to look.

On launchpad:

Of those, there are two interesting ones to note:

  • https://bugs.launchpad.net/nova/+bug/1829062 nova placement api non-responsive due to eventlet error. When using placement-in-nova in Stein, recent eventlet changes can cause issues. As I've mentioned on the bug, the best way out of this problem is to use placement-in-placement, but there are other solutions.

  • https://bugs.launchpad.net/nova/+bug/1829479 The allocation table has residual records when an instance is evacuated and the source physical node is removed. This appears to be yet another issue related to orphaned allocations during one of the several move operations. The impact they are most concerned with, though, seems to be the common "When I bring up a new compute node with the same name there's an existing resource provider in the way" that happens because of the unique constraint on the rp name column.

I'm still not sure that constraint is the right thing unless we want to make people's lives hard when they leave behind allocations. We may want to make it hard because it will impact quota...
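
For operators who hit that conflict today, the usual manual workaround is to clean up the leftover provider and its allocations with osc-placement before the replacement compute node registers. A rough sketch (the hostname and UUIDs are placeholders, and exact option support depends on the osc-placement version installed):

# Locate the stale resource provider left behind by the removed compute node
openstack resource provider list --name old-compute-0.example.com

# Remove allocations still held by orphaned consumers (for example, evacuated instances)
openstack resource provider allocation delete <consumer_uuid>

# Delete the now-empty provider so the new node can register under the same name
openstack resource provider delete <provider_uuid>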

osc-placement

osc-placement is currently behind by 11 microversions. No change since the last report.

Pending changes:

Note: a few of these having been sitting for some time with my +2 awaiting review by some other placement core. Please remember osc-placement when reviewing.

Main Themes

Now that the PTG has passed some themes have emerged. Since the Nested Magic one is rather all encompassing and Cleanup is a catchall, I think we can consider three enough. If there's some theme that you think is critical that is being missed, let me know.

For people coming from the nova-side of the world who need or want something like review runways to know where they should be focusing their review energy, consider these themes and the links within them as a runway. But don't forget bugs and everything else.

Nested Magic

At the PTG we decided that it was worth the effort, in both Nova and Placement, to make the push to make better use of nested providers — things like NUMA layouts, multiple devices, networks — while keeping the "simple" case working well. The general ideas for this are described in a story and an evolving spec.

Some code has started, mostly to reveal issues:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

As we explore and extend nested functionality we'll need to do some work to make sure that the code is maintainable and has suitable performance. There's some work in progress for this that's important enough to call out as a theme:

Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment.

Other Placement

  • https://review.opendev.org/#/q/topic:refactor-classmethod-diaf A suite of refactorings that given their lack of attention perhaps we don't need or want, but let's be explicit about that rather than ignoring the patches if that is indeed the case.

  • https://review.opendev.org/645255 A start at some unit tests for the PlacementFixture which got lost in the run up to the PTG. They may be less of a requirement now that placement is running nova's functional tests. But again, we should be explicit about that decision.

Other Service Users

New discoveries are added to the end. Merged stuff is removed.

End

I'm out of practice on these things. This one took a long time.

by Chris Dent at May 17, 2019 03:32 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.

Last updated:
June 19, 2019 12:52 PM
All times are UTC.

Powered by:
Planet