### Planet OpenStack

May 15, 2012

OpenStack Blog

Here is what happens inside Nova when you provision a VM

At the Essex conference summit this past month, we presented a session on OpenStack Essex architecture. As a part of that workshop we visually demonstrated the request flow for provisioning a VM and went over the Essex architecture. There was a lot of interest in this material; it’s now posted on SlideShare:

In fact, we’ve packaged up the architecture survey/overview as part of our 2-day Bootcamp for OpenStack. The next session is scheduled for 14-15 June. This time around we will carry out the training at the Santa Clara, CA offices of our friends at Nexenta. The last course was delivered at our Mountain View office right before the OpenStack summit in April to a sold-out crowd. You can find more information about the course at www.mirantis.com/training

by Mirantis Inc. at May 15, 2012 07:06 PM

Duncan McGreggor

CERN, OpenStack Keep Resonance Cascades at Bay

Tim Bell preparing to get his
OpenStack on
As previously mentioned, there's a growing momentum around ops-oriented participation in the OpenStack community. DreamHost is deeply invested in DevOps, seeing how that's where we're going to be living in a few months! As Simon Anderson, CEO of DreamHost, recently said:
"When we're running a complex fabric of apps on over 5,000 servers across three data centers, we need a lean and nimble approach to software development and operational implementation. Without a DevOps approach, we wouldn't be able to push code into production as fast or as efficiently as we do, and our customers would not be happy! Today's developers demand up-to-the-hour security and performance updates to Internet infrastructure, so we aim to deliver just that with DevOps."
Though expressed in the context of our work, the import of DevOps that Simon's comment generally highlights is going to be increasingly important for nearly anyone running cloud services. 

In particular, I've been following the work of the intrepid folks at CERN. As such, this post is not about DreamHost; rather, it's a mad tale of OpenStack, DevOps, and averting alien invasion.

After countless long-distance phone conversations, a flight to Switzerland, and spending several days buying pints for a security guard in the know (referred to from now on as "Barney"), I've uncovered some profound truths -- Mulder-style -- and have confirmed that the impact of OpenStack at CERN is huge. 

Superficial examinations turn up the usual: CERN's planning slides, nice quotes, discussions of features and savings in time and money. For instance, in a recent email conversation with Tim "Gordon Freeman" Bell at CERN, I learned that 
"The CERN Agile Infrastructure project aims to develop CERN's computing resources and processes to support the expanding needs of LHC physicists and the CERN organisation."
I think these guys have been hanging out with Simon! But once you slip behind the scenes, peek at some of the whiteboards in unattended rooms, or rifle through notes lying about, you see that things are not what they appear. I've included a shot of Mr. OpenStack-at-CERN himself; this was my first clue.

Publicly, he's been working with other teams at CERN to:
  • modernise the data centre configuration tools and automating operations procedures
  • exploit wide scale use of virtualisation, improving flexibility and efficiency
  • enhance monitoring such that the usage of the infrastructure can be fully understood and tuned to maximise the resources available
But privately, it seems that he and his team have been doing much, much more. This was alluded to in a statement made by team member Jan van Eldik: "We expect the number of requests to insert non-standard specimens into the scanning beam of the Anti-Mass Spectrometer to significantly decrease, once automation is in place and everyone is using the standard infrastructure we are setting up."

That isn't to say there haven't been incidents...

Innocuously enough, the current toolchains are based around:
  • OpenStack as a single Infrastructure-as-a-Service providing physics experiment services, developer boxes, applications servers as well as the large batch farm
  • Puppet for configuration management
  • Scientific Linux CERN as the dominant operating system with sizeable chunk of Windows installs
But that second bullet caught my eye, and one of Barney's pub mates confirmed a rumor that we'd heard: the Puppet instances are actually trained headcrabs. The primary training tool? You guessed it, a crowbar. Barney said that the folks from Dell took inspiration from this and developed it further for their OpenStack deployment framework after an extended visit to CERN.

Although Barney hadn't seen any evidence of resonance cascades, there have been minor cross-dimensional disturbances as a result of some "cowboy" activity and folks not following DevOps best practices. This has been kept quiet for obvious reasons, but has led to a small pest problem in some of CERN's older tunnel complexes. As rogue elements are discovered, CERN has been educating transgressors aggressively. (Sometimes they go as far as sending employees to Xen training... or was it Xen training?)

One artist's conception of what success will
look like for OpenStack at CERN
Despite the minor hiccoughs along the way, CERN is aiming for success. (Given the lack of Combine and forced relocation programs, they're already doing better than Black Mesa's Anomalous Materials team.) Plans are in place for an initial pre-production service, OpenStack deployment this year. Following that, they will be moving towards 300,000 virtual machines on 15,000 hosts spread across two data centres by 2015.

The OpenStack community is supporting them in their efforts with fantastic new features, high-quality discussions on the mail lists, and real-time interaction on the IRC channels. In an act of reciprocity and community spirit, operators at CERN have volunteered to contribute back to the OpenStack community with regard to operations best practices, reference architecture documentation, and support on the operators' mail list.

To see how other institutions were taking this news, I spent several days waiting on hold. In particular, Aperture Science could not be reached for comment. However, Ops team member Belmiro Rodrigues Moreira did say that there's an audio file being circulated at CERN of Cave Johnson threatening to "burn down OpenStack" ... with lemons. Given Aperture Science's failure record with time machine development, it's generally assumed to be a prank audio reconstruction. CloudStack developers are considered to be the prime suspects, seeing how much time they have on their hands while waiting for ant to finish compiling the latest Java contributions.

When asked what advice he could give to shops deploying OpenStack, Tim said simply: "Remember, the cake is a lie. Don't get distracted and don't stop. Just keep hacking."

Alyx, explaining to her dad why she loves DreamHost
Couldn't have said it better myself.

In closing, and interestingly enough, one of DreamHost's employees has an uncle who works at the Black Mesa Research Facility. Though his teleportation research team was too busy for an extended interview, his daughter did mention that she is a DreamHost customer and can't wait to use OpenStack while interning at CERN next summer. After all, that's what she uses to auto-scale her WordPress blog (she's in our private beta program).

It's a small world.

And, thanks to Tim and the rest at CERN, a safer one, too.


by Duncan McGreggor (noreply@blogger.com) at May 15, 2012 07:05 PM

May 14, 2012

SwiftStack Team

Swift Part-Power and Performance

We work a lot with smaller-scale Swift clusters. Today we were asked a question about a Swift configuration setting which relates to the number of partitions in the ring data structure that are created. Data lives in partitions and it's up to an operator to decide how many partitions should be created.

Question

I will take this opportunity to ask you a question (if you don't mind and have time) regarding the ring size. I would like to know if you had an experience of a very large ring file (something like 2^22 or 2^23) on a small cluster (between 4 and 20 servers). I would like to know if you saw any impact on performance CPU wise or others. We have good performance with 2^21 right now.. but was wondering if I should change it before launching in case we grow fast.

Answer

Basically, you're not going to chew any more CPU on the Swift nodes by using a bigger ring. You will chew more memory, but not CPU.

Internally, the ring has 2 interesting data structures: a list of arrays called "_replica2part2dev", and a list of devices called "devs". To look up the devices for a partition, the (pseudo)code is something like this:

```python
def devices_for_partition(ring, partition):
    """Yield the device holding each replica of the given partition."""
    for replica in range(ring.replicas):  # ring.replicas is probably 3
        dev_id = ring._replica2part2dev[replica][partition]  # list index, then array index
        dev = ring.devs[dev_id]  # list index
        yield dev  # or do something else useful with it
```

You can see that all those operations are O(1), so increasing your ring size will not materially affect the CPU consumed on your nodes.
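As a rough, self-contained illustration of the lookup above, the sketch below builds a toy ring in memory. The `ToyRing` class, its device names, the round-robin assignment, and the part power of 4 are all made up for the example; only the two structures (`_replica2part2dev` and `devs`) mirror the description in this post.

```python
from array import array


class ToyRing:
    """Hypothetical stand-in for a ring: a list of arrays plus a list of devices."""

    def __init__(self, part_power, replicas, devs):
        self.replicas = replicas
        self.devs = devs
        partitions = 2 ** part_power
        # One array per replica; entry [replica][partition] holds a device id.
        # The round-robin assignment is for illustration only.
        self._replica2part2dev = [
            array("H", ((part + replica) % len(devs) for part in range(partitions)))
            for replica in range(replicas)
        ]

    def devices_for(self, partition):
        # Each step is a constant-time index operation, whatever the ring size.
        for replica in range(self.replicas):
            dev_id = self._replica2part2dev[replica][partition]
            yield self.devs[dev_id]


ring = ToyRing(part_power=4, replicas=3, devs=["sda", "sdb", "sdc", "sdd"])
print(list(ring.devices_for(5)))  # ['sdb', 'sdc', 'sdd']
```

Raising `part_power` makes each array longer, but the number of index operations per lookup stays the same.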

If you want to get an idea of how much extra memory would be consumed, grab a python repl (I like ipython) and load up a ring file:

```python
import gzip
import pickle

ringdata1 = pickle.load(gzip.GzipFile(filename='/etc/swift/object.ring.gz'))
```

Now check your process's memory usage.

```python
ringdata2 = pickle.load(gzip.GzipFile(filename='/etc/swift/object.ring.gz'))
```

Now check it again and take the difference. That'll be a decent approximation to how much extra memory would be consumed (per Swift daemon!) by adding 1 to your part_power.

Note that Swift daemons don't do any sort of data-sharing with this big data structure, so every proxy, account/container/object server, replicator, and auditor will each have its own copy of the ring. Remember to multiply the memory-usage delta by the number of daemons you've got running.
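For a back-of-the-envelope figure before measuring anything, the arithmetic can be sketched as below. It assumes 2 bytes per table entry (an unsigned 16-bit device id) and ignores the `devs` list and Python object overhead, so treat the result as a lower bound. The function names and the daemon count of 20 are hypothetical.

```python
def ring_table_bytes(part_power, replicas=3, bytes_per_entry=2):
    """Approximate size of the replica-to-partition-to-device table."""
    return (2 ** part_power) * replicas * bytes_per_entry


def extra_bytes_for_next_part_power(part_power, daemons, replicas=3):
    """Extra memory across all daemons when part_power grows by one."""
    delta = ring_table_bytes(part_power + 1, replicas) - ring_table_bytes(part_power, replicas)
    return delta * daemons


MIB = 1024 * 1024
for power in (21, 22, 23):
    print(power, ring_table_bytes(power) / MIB, "MiB per daemon")

# 20 daemons is an arbitrary example count, not a recommendation.
print(extra_bytes_for_next_part_power(21, daemons=20) / MIB, "MiB extra in total")
```

Under these assumptions, each increment of the part power doubles the table, which matches the load-it-twice measurement trick described above.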

Hope this helps.

May 14, 2012 01:21 PM

OpenStack Blog

Indian User Group first meet up

The first formal meetup of the Indian OpenStack User Group was held in Bangalore last weekend on the 5th of May. The event was attended by 25 enthusiastic InStackers.

We were hoping to see a few more people but there were a couple of other tech events on the same day. The venue for the meetup was the terrace of the Jacaranda Block on Brigade Millennium Rd. The venue was provided by Ahimanikya Satapathy from Fresco Informatics. On the agenda were talks by Govind Tatachari, Deepak Garg from Citrix and Kavit Munshi from Aptira.

Govind did a presentation on Cloud and IaaS and emerging technologies. The presentation was informative and the users started a discussion about what the cloud was and where to employ it. This set up the ground for the next presentation by Deepak.

Deepak discussed the architecture of OpenStack with particular attention to Nova. The users showed a lot of interest and asked a lot of questions about the underlying architecture of OpenStack.

Kavit discussed OpenStack Swift’s architecture and the various options available to monitor a Swift installation. The talk was very open with most of the time spent with user questions and scenarios they raised. The users also discussed the possibility of doing a live demo or setup of OpenStack at the next meet up. The meetup lasted around 3 hours.

It was great to see the enthusiasm of all the participants and we look forward to our next meeting.
Photos of the event are here. There is also a Linked In group here.

(thanks to Kavit for this blog post)

by Tristan Goode at May 14, 2012 01:01 AM

May 13, 2012

Duncan McGreggor

DreamJob!

Do you love architecting new and creative software? Are you a hacker with mad Python skills and a freak for distributed services? Would you like to see your work offered to a huge, Internet audience? Do you want to help build a community around your work? Do you always vote for the underdog?

I've got just the position for you!

DreamHost is hiring for a new, senior engineering opening on the Cloud Team in the Development Group, and if you can not only easily imagine the extraordinary skill sets necessary to do what we're planning, but also have that skill set, we have got to talk.

We're a small company that's pure heart-and-soul with a culture that simply can't be beat. We've been hiring some incredible talent from the Python and open source communities, and need to finish building out this visionary team that will be taking DreamHost into the next 10 years of online services and software. This role is particularly focused on shaping that future -- from a technical as well as strategic perspective.

You can email me or ping me on IRC (oubiwann on freenode.net). We can chat, and if you've got what it takes, we will set up an interview with the rest of the Cloud Team.

I look forward to hearing from you :-)

Update: There's been a lot of interest in this position from folks with a wide range of professional experiences, so I've shared the job description here. This should give you a good sense of what we're looking for from your past, and the sorts of things we'd be expecting in your future :-)

by Duncan McGreggor (noreply@blogger.com) at May 13, 2012 01:54 AM

May 11, 2012

Zmanda

Zmanda “googles” cloud backup!

Today, we are thrilled to announce a new version of Zmanda Cloud Backup (ZCB) that backs up to Google Cloud Storage. It feels great to support perhaps the first mainstream cloud storage service we were introduced to (via the breakthrough Gmail and other Google services), and considering the huge promise shown by Google’s cloud services, we are sure that this version will be very useful to many of our customers.

However, a new cloud storage partner explains only part of the excitement. :) What makes this version more significant to us is its new packaging. As you may be aware, until now ZCB came only in a Pay-As-You-Go format. While this option has been great for our customers who value the flexibility offered by this model, we realized that there are other customers (such as government agencies) who need a fixed amount to put down in their proposals and budget provisions. To put it differently, these customers would rather trade off some of the flexibility for certainty.

So with these customers in mind, we chose to offer this ZCB version in the following prepaid usage quota based plans:

  • $75/year for 50 GB
  • $100/year for 100 GB
  • $1,000/year for 1000 GB
  • $10,000/year for 10000 GB

Note that the above GB values are the maximum size of data users can store on the cloud at any point in time. The prices above are inclusive of all costs of cloud storage and remain unaffected even if you wish to protect multiple (unlimited!) systems.

So what are the benefits of this new pricing option? Here are some:

  • Budget friendly: Whether you are an IT manager submitting your annual IT budget for approval or a service provider vying for a client’s business, the all-inclusive yearly plans are a great option, one you can confidently put down in writing.
  • Cost effective: If you know your requirements well, this option turns out to be dramatically cost effective. Here is a rough comparison of our pricing with some other well-known providers:
    [Chart: ZCB price comparison with other providers]

    Note:
    Zmanda Cloud Backup: The annual plan pricing for Google Cloud Storage version was used.
    MozyPro: Based on http://mozy.com/pro/pricing/ “Server Pass” option was chosen since ZCB protects Server applications at no extra cost.
    JungleDisk: Based on: https://www.jungledisk.com/business/server/pricing/ Rackspace storage option was used since this was the only “all-inclusive” price option

  • More payment options: In addition to credit cards, this version supports a variety of payment options (such as Bank transfer, checks, etc.). So whether you are a government agency or an international firm, mode of payment is never going to be an issue.
  • Simplified billing and account management: Since this aspect is entirely handled by Zmanda, it is much easier and user friendly to manage your ZCB subscription. So no more hassles of updating your credit card information and no need of managing multiple accounts. When you need help, just write to a single email id (zcb@zmanda.com), or open a support case with us, and we will assist you with everything you may need assistance with.
  • Partner friendly: The direct result of all the above benefits is that reselling this ZCB version will be much more simplified and rewarding. If you are interested in learning more, do visit our new reseller page for more details.

So with all the great benefits above, do we still expect some customers to choose our current pay-as-you-go ZCB version for Amazon S3? Of course! As we said, if your needs are currently small or unpredictable, the flexibility of scaling up and down without committing to a long term plan is a sensible option. And the 70 GB free tier and volume discount tier offered on this ZCB version can keep your monthly costs very low.

Oh and I almost forgot - along with this version, we have also announced the availability of ZCB Global Dashboard, the web-interface to track usage and backup activity of multiple ZCB systems at a single place. If you have multiple ZCB systems in your environment or you are a reseller, it will be extra useful to you.

As we work on enhancing our ZCB solution more, please keep sending us your feedback at zcb@zmanda.com. Much more is cooking with Cloud Backup at Zmanda. Will be with you with more exciting news soon!

-Nik

by Nikunj at May 11, 2012 07:41 PM

OpenStack Blog

Community Weekly Review (May 4-11)

OpenStack Community Newsletter – May 4, 2012

HIGHLIGHTS

Upcoming Events

Other news

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at May 11, 2012 05:56 PM

May 10, 2012

Stefano Maffulli

Planning an International Community Portal for OpenStack

With the large growth of OpenStack internationally comes the need to have a better system to list the international resources for new users of OpenStack. At the moment we have a couple of wiki pages, a mailing list for a team hosted on Launchpad and the map on the /community page on openstack.org. All that content is available only in English. We’re at the point that this is not enough.

I’d like to discuss the needs of the international community and get a new system in place in the next few weeks. The basic needs are:

  • A directory of OpenStack user groups (OSUG) that can host content in different languages: new visitors should be able to find easily an OpenStack User Group for their local area/language. If such group/language is not available, there should be an easy pointer to instructions, tools and policies to create one
  • A system for the community  managers to contact the members (all members or just the coordinators/leaders?) of the international communities to coordinate activities.

Requirements

  • Register users using SSO: as a user I would like to be able to associate my profile from Launchpad, Linkedin or Google to the site
  • Support content in multiple languages (switch list and automatic recognition via browser agent configuration)
  • Support roles: managers of the groups can add resources to the directory, members can sign up as members, anonymous can read all content
  • Show activity from all groups in my own language on the portal home page
  • Directory of OSUGroups, with geographic representation (be able to view the groups on a map and display also the full list of groups on a page)
  • Manage content (pages) of generic interest (to host content like how to start a group, general, policies, trademark stuff, generic icons, etc)

Per each user group:

  • allow users to add events, each group will expose its ical feed
  • show a list of additional resources for the group: mailing lists, forums, wiki pages, home page, URLs of blogs
  • import RSS feed from blogs to aggregate content on groups page
  • display photostreams from flickr and such on the home page

Open questions

  • is this all we need?
  • do we want to host and provide web apps for any of the local groups (mlists, blog, forums, etc)? And if yes, should these be part of such a portal?
  • can we reuse code from Ubuntu Loco portal? The code is tightly integrated in Launchpad, local teams need to be created as Launchpad Teams, it uses Launchpad as OpenID provider (bugs included). But it’s already there, it’s fairly simple and it’s a django app
  • What other tools can we use for this and do you volunteer to manage such tool?

I’m interested in your opinions: join the OpenStack International Community Team on Launchpad  to discuss this further.

by Stef at May 10, 2012 05:20 PM

Florian Haas

Returning to Paris for OpenStack in Action 2: Production Ready

This month, I'm thrilled to go to Paris to talk about highly-available OpenStack. The event I'm speaking at is OpenStack in Action 2: Production Ready, and it's being organized by French hosting & cloud services provider eNovance.

by florian at May 10, 2012 11:30 AM

May 08, 2012

Florian Haas

Our first Cloud Bootcamp is now Sold Out

Less than two weeks after it was announced, our inaugural Cloud Bootcamp for OpenStack™ in Wellington, New Zealand is now sold out. Our friends at Catalyst IT have put up a wait list, and we're currently working on tacking on extra days to meet the excess demand.

This will be fun.

by florian at May 08, 2012 09:44 PM

May 07, 2012

Zmanda

Great Combination for Cloud Storage: Ubuntu 12.04 + OpenStack Swift Essex Release

We are very excited to see the release of Ubuntu 12.04 LTS and OpenStack Essex, especially the Essex version of OpenStack Swift and the brand-new Dashboard. We have not yet seen any performance review of OpenStack Swift Essex running on Ubuntu 12.04. The official Dashboard demo introduced the System Panel and Manage Compute components, without any details for the Object Store. So, we did an apples-to-apples cloud backup performance comparison between OpenStack Swift Essex on Ubuntu 12.04 LTS and OpenStack Swift 1.4.6 on Ubuntu 11.10, and we also demonstrate the functionality of the Object Store in the OpenStack Dashboard.

In the following, we will first report our results on some select hardware configurations of proxy and storage node on EC2. Our previous blog (Next steps with the OpenStack Advisor) provides details about these hardware configurations and we use the following four configurations as the example implementations of a “small-scale” Swift cloud.

  • 1 Large Instance based proxy node: 5 Small Instance based storage nodes
  • 1 XL Instance based proxy node: 5 Small Instance based storage nodes
  • 1 CPU XL Instance based proxy node: 5 Small Instance based storage nodes
  • 1 Quad Instance based proxy node: 5 Medium Instance based storage nodes

The Large, XL, CPU XL and Quad instances cover a wide range of CPU and memory selections. For network I/O, Large, XL and CPU XL instances are provisioned with a Gigabit Ethernet (100~120MB/s), while the Quad instance offers 10 Gigabit Ethernet (~1.20GB/s) connectivity.

Again, we use Amanda Enterprise as our application to backup and recover a 10GB data file to/from the Swift cloud to test its write and read throughput respectively. We ensure that one Amanda Enterprise server can fully load the Swift cloud in all cases.

Two systems involved in the comparison are: (1) Ubuntu 11.10 + OpenStack Swift 1.4.6; (2) Ubuntu 12.04 LTS + OpenStack Swift Essex (Configuration parameters of OS, OpenStack and Amanda Enterprise are identical).  In the following, we use 11.10+1.46 and 12.04+Essex as the labels to represent the above two systems.

(1) Proxy node runs on the Large instance and 5 storage nodes run on the Small instances. (Note that the throughput values on y-axis are not plotted from zero)

(2) Proxy node runs on the XL instance and 5 storage nodes run on the Small instances.

(3) Proxy node runs on the CPU XL instance and 5 storage nodes run on the Small instances.

(4) Proxy node runs on the Quad instance and 5 storage nodes run on the Medium instances.

From the above comparisons, we found that 12.04+Essex performs better than 11.10+1.4.6 in terms of backup throughput; the performance gap ranges from 2% to 20%, with an average of 9.7%. For recovery throughput, the average speedup over 11.10+1.4.6 is not as significant as for backup throughput.

We did not dig into which of the two (Ubuntu 12.04 LTS or OpenStack Essex) is the cause of this slight improvement in throughput. But we can see that the overall combination performs statistically better. From our initial testing, based on the performance improvements as well as feature improvements, we encourage anyone who is running OpenStack Swift on Ubuntu to upgrade to the latest released versions to take advantage of their new updates. Five years of support for 12.04 LTS is a great assurance to maximize ROI for your cloud storage implementation.

Next, we demonstrate the functionality of Object Store within the OpenStack Dashboard.

After we log into the Dashboard and click the “Project” tab on the left and then “Containers” under “Object Store”, we see the screen below:

We can create a container by clicking “Create Container” button and we see the following screen:

After creating a container, we can click the container name and browse the objects associated with that container. Initially, a newly-created container is empty.

We can upload an object to the container by clicking “Upload Object” button:

Meanwhile, we can delete an object from the container by choosing the “Delete Object” from its corresponding drop-down list at the “Actions” column.

Also, we can choose to delete a container by  choosing the “Delete Container” from its corresponding drop-down list at the “Actions” column.

Here, we demonstrate the core functionality of the Object Store in the OpenStack Dashboard. From the above screenshots, we can observe that the Dashboard provides very neat and friendly user interfaces to manage the containers and objects. This saves a lot of time looking up command-line syntax for basic functionality.

Congratulations, Ubuntu and OpenStack teams!  The Ubuntu 12.04 + OpenStack Swift Essex Release combination is a great contribution to Open Source and Cloud Storage communities!

by ning at May 07, 2012 08:48 PM

Doug Hellmann

cliff -- Command Line Interface Formulation Framework -- version 0.5


cliff is a framework for building command line programs. It uses
setuptools entry points to provide subcommands, output formatters, and
other extensions.

What's New In This Release?

  • Asking for help about a command by prefix lists all matching
    commands.
  • Add formatters for HTML, JSON, and YAML.

Documentation

Documentation for cliff is hosted on readthedocs.org

Installation

Use pip:


$ pip install cliff

See the installation guide for more details.


by Doug Hellmann (noreply@blogger.com) at May 07, 2012 11:49 AM

May 04, 2012

Zmanda

Building an OpenStack Swift Cloud: Mapping EC2 to Physical hardware

As we mentioned in an earlier blog, it may seem ironic that we are using a public compute cloud to come up with an optimized private storage cloud. But the ready availability of diverse types of EC2-based VMs makes AWS a great platform for running the Sampling and Profiling phases of the OpenStack Swift Advisor.

After an optimized Swift Cloud is profiled and designed on the virtualized hardware (for example, EC2 instances in our lab), the cloud builders will eventually want to build it on the physical hardware. The question is: how to preserve the cost-effectiveness and guaranteed throughput of the Swift Cloud on the new physical hardware with new data center parameters?

A straightforward answer is to keep the same hardware and software resources in the new hosts. But the challenge is that EC2 (this challenge remains if other cloud compute platforms, e.g. OpenStack Compute, were used for profiling) provisions the CPU resource for each type of instance in terms of “EC2 Compute Units”; e.g. a Large instance has 4 EC2 Compute Units and a Quad instance has 33.5 EC2 Compute Units. The question is: how do you translate 33.5 EC2 Compute Units into GHz when you purchase physical CPUs on the market for the servers? Another ambiguous resource definition associated with EC2 is network bandwidth. EC2 has 4 standards of network bandwidth: Low, Moderate, High and Very High; for example, EC2 allocates Low bandwidth to the Micro instance and Moderate bandwidth to the Small instance. But what does “Low bandwidth” mean in terms of MB/s? The EC2 specs provide no answers.

Here we want to propose a method to translate these ambiguous resource definitions (e.g. EC2 Compute Units) into standard specifications (e.g. GHz) that can be referred to when choosing the physical hardware. We focus on 3 types of hardware resources: CPU, disk and network bandwidth.

CPU: We first choose a CPU benchmark software (e.g. PassMark) and run it on a certain type of EC2 instance to get a benchmark score. Then, we look up the published benchmark scores of that benchmark software to find out which physical CPU got the similar score. For safety, we can choose the physical CPU with a little higher score to ensure it performs no worse than the virtualized CPU in the EC2 instance.

Disk: We roughly assume the I/O patterns in storage nodes are close to sequential, and we can use the “dd” Linux command to benchmark the sequential read and write I/O bandwidths on a certain type of EC2 instance. Based on the I/O bandwidth results in terms of the MB/s, cloud builders can buy the physical storage drives with the matching I/O bandwidths.
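For readers who prefer a scriptable check alongside “dd”, a minimal sketch of the same idea is shown below. It is not the procedure used for the numbers in this post; the function names, the 1 MiB block size, and the 8 MiB demo size are assumptions. Note that the read figure will be inflated by the page cache unless the file is much larger than memory or caches are dropped first.

```python
import os
import tempfile
import time


def sequential_write_mbps(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes sequentially to path and return throughput in MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as handle:
        while written < total_bytes:
            chunk = block[: total_bytes - written]
            handle.write(chunk)
            written += len(chunk)
        handle.flush()
        os.fsync(handle.fileno())  # include the time needed to reach the disk
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


def sequential_read_mbps(path, block_size=1024 * 1024):
    """Read path sequentially, discard the data, and return throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / elapsed / 1e6


# Small demo run in a temporary directory; use a far larger size for real tests.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "bench.bin")
    size = 8 * 1024 * 1024
    print("write MB/s:", round(sequential_write_mbps(target, size), 1))
    print("read MB/s:", round(sequential_read_mbps(target), 1))
```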

Network: To test the maximum bandwidth of a certain EC2 instance within the Swift Cloud, we set up another EC2 instance with very high network bandwidth, e.g. the EC2 Quad instance. First, we install Apache and create a test file (the size of the file depends on the memory size, as discussed later) on both EC2 instances. Then, in order to benchmark the maximum incoming network bandwidth of the EC2 instance, we issue a wget command on that EC2 instance to download the test file hosted on the Quad instance. The wget command will give the average network bandwidth after the download is finished, and we will use it as the maximum incoming bandwidth. To test the maximum outgoing network bandwidth, we run the above test in the reverse direction: the Quad instance downloads the test file from the EC2 instance we want to benchmark. The reason we choose wget (instead of e.g. scp) is that wget involves less CPU overhead. Notice that, to remove the interference from disk I/O, we ensure the test file can fit into the memory of the EC2 instance so that no read I/Os are needed. Also, we always execute wget with “-O /dev/null” to bypass the write I/Os. Once we get the maximum incoming and outgoing network bandwidths, we can choose the right Ethernet components to provision the storage and proxy nodes.
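The same measurement can be approximated in a few lines of Python, shown here as a sketch rather than a replacement for wget. The helper names are hypothetical, and the example URL in the comment is a placeholder. As with “-O /dev/null”, the downloaded bytes are counted and then dropped, so no write I/O is involved.

```python
import time
import urllib.request


def stream_mbps(stream, block_size=1024 * 1024):
    """Read a file-like object to the end; return (bytes_read, MB/s)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        total += len(chunk)  # bytes are counted and discarded
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1e6


def download_mbps(url, block_size=1024 * 1024):
    """Download url without storing it; return (bytes_read, MB/s)."""
    with urllib.request.urlopen(url) as response:
        return stream_mbps(response, block_size)


# Example call (placeholder host, not a real endpoint):
# total, rate = download_mbps("http://quad-instance.example/testfile")
```

Running `download_mbps` on the instance under test gives its incoming figure; running it on the high-bandwidth instance against the instance under test gives the outgoing figure.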

Memory: As for the virtualized memory in an EC2 instance, if 10 GB of memory is associated with the instance, then it is straightforward to provision 10 GB of memory in the physical server. So, we feel that no translation is needed for virtualized memory.

Other cloud management platforms may offer several types of instances (e.g. large, medium, small) based on their own terminology. We can use methods similar to those above to benchmark each type of instance they offer and find the matching physical hardware.

To make sure the throughput of the Swift Cloud is preserved when mapping from EC2 instances, we advise cloud builders to provision physical hardware with specs at least 10% better than those deduced by the translation above.

Here, we show an example of how to map an EC2 c1.xlarge instance to physical hardware:

CPU: We run the PassMark CPU benchmark on c1.xlarge. The CPU score from PassMark is 7295. Allowing for 10% more resources when translating from virtualized to physical hardware, some choices of physical CPU include: Intel Xeon E3-1245 @ 3.30 GHz, Intel Xeon L5640 @ 2.27 GHz, Intel Xeon E5649 @ 2.53 GHz, etc.

Memory: As the c1.xlarge instance is allocated 7 GB of memory, we could choose 8 GB of memory (4 GB x 2 or 2 GB x 4) in the physical machine.

Disk: Using the “dd” command, we found that the c1.xlarge instance delivers 100-120 MB/s for sequential reads and 70-80 MB/s for sequential writes, which matches a typical 7,200 RPM drive. Therefore, most HDDs on the market are safe to use as data disks in the physical machine.

Network: The c1.xlarge instance has around 100 MB/s of network bandwidth for both incoming and outgoing traffic, which corresponds to a 1 Gigabit Ethernet interface. So a typical 1 Gigabit Ethernet interface should be enough for the physical machine.
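The arithmetic behind this example can be written down as a short Python sketch. The helper names and the list of candidate link speeds are assumptions made for illustration; the measured values (7295, 7 GB, 100 MB/s) and the 10% margin come from the text above.

```python
import math


def with_headroom(measured, margin=0.10):
    """Return the measured value plus the safety margin."""
    return measured * (1.0 + margin)


def smallest_link_gbit(mb_per_s, link_speeds_gbit=(1, 10)):
    """Pick the smallest Ethernet speed that can carry mb_per_s.

    Uses 8 bits per byte and 1 Gbit/s = 1000 Mbit/s, ignoring protocol overhead.
    """
    needed_gbit = mb_per_s * 8 / 1000.0
    for speed in link_speeds_gbit:
        if speed >= needed_gbit:
            return speed
    raise ValueError("no listed link is fast enough")


cpu_target = with_headroom(7295)             # PassMark score to look for: about 8024.5
ram_target_gb = math.ceil(with_headroom(7))  # 7.7 GB rounds up to 8 GB
link = smallest_link_gbit(100)               # 800 Mbit/s fits a 1 Gbit/s link
```

The CPU target of roughly 8025 is the score a candidate physical CPU should reach or exceed in the published PassMark tables.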

If you are thinking of putting together a storage cloud service, we would love to discuss your challenges and share our observations. Please drop us a note at  swift@zmanda.com

by ning at May 04, 2012 10:26 PM

OpenStack Blog

Community Weekly Review (Apr 27-May 4)

OpenStack Community Newsletter – May 4, 2012

HIGHLIGHTS

Upcoming Events

Other news

Community Statistics

This week’s chart shows the source of visits to http://forums.openstack.org from February 1st to May 1st

Location of Visits to OpenStack Forums from Feb 1st to May 1st 2012

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at May 04, 2012 09:53 PM

May 03, 2012

Stefano Maffulli

In search of a modern way to hold discussions online

The OpenStack community decided at the Design Summit to create new lists and consolidate all of them on a new service, so I decided to lay down the specifications for the new system following the desiderata from developers and users. The basic need is to allow developers to discuss freely using the tool they prefer (email clients, in this case). As community manager, I also need to be able to measure discussions and make it easy for new developers and users to join the conversations. The desiderata for the messaging system are:

  • Must use email messages as the primary means of communication
  • Must allow tagging/topics for easy inbox filtering
  • Must be easy to manage (dealing with spam, delivery, moderation, etc)
  • Must have good looking archives, skinnable, with search capabilities and SEO friendly
  • Must allow measuring activity, natively or with tools like mlstats
  • Nice to have:
      • SSO integration with OpenID and more
      • Post new message (reply or start a new thread) via web
      • Offer archive via RSS

With these in mind I started looking into Mailman, the typical answer for mailing list management. The software is known and solid, although the latest stable release is old. Mailman 2 has the advantage of familiarity: we know how it works and its limitations. Mainly I know the limitations: the web UI is scary and I can’t find a way to teach Firefox’s password manager to save the passwords for each list (it associates the password with the domain, not the full URL, so I can only have one password associated with lists.openstack.org; am I doing something wrong?). The default archives are also ugly and primitive, forcing us to use other archivers, like Markmail.

Mailman 3, the upcoming release, is … not there yet. I could only see mockups of the new web management UI (called Postorius, a Django app that is a client of the new Mailman REST API) and it seems that Mailman 3.0 will ship without an official archiver.

I looked at Sympa as an alternative to Mailman, since Rackspace uses it internally. It has most of the features we need, including the nice-to-have but it seems to be lacking the Topics (although, we should say that we’re not using the Mailman topics feature at the moment anyway). I don’t think that mlstats supports Sympa and I’m not sure about its tracking capabilities (but it stores lots of metadata in a SQL database so it shouldn’t be too hard to get information from it).

Since the mailing list archives I know all look too ugly, I expanded my search to forum software, hoping that there had been some progress there in the past years. The only new thing I found is Vanilla Forums, a GPLv2 forum engine. It mixes features of the old bulletin board format with the newer question/answer concept, embracing tags and categories. The first page of Vanilla has more meaningful content than the silly topics seen in most bulletin boards, and in general I found Vanilla to have a better UI than most forum software. The hosted version of Vanilla Forums also sports a nice integration with email, but there are no plans to release that feature under the GPLv2. The view of a thread with many responses is not exciting though: it has no hierarchy, failing at readability like all bb/forum software … and I think hierarchy is a crucial feature that enables following long discussions. This deserves more thought: I know Twitter gave up trying to represent threaded discussions in a single page (but Twitter was never meant for discussions), identi.ca used to have the conversation view with grades of colors but got rid of it, and Gmail doesn’t bother either and shows conversations as a flat, time-based sequence of messages. Is it just too difficult to thread discussions like any email client used to do, or what else is going on?

To me it seems that Mailman is still the best we can do at the moment even if it leaves me in a pretty sad state, stuck in 2001. I would start looking into Mailman 3 and expand the search to an archiver that we can host (like CSLA) but Mailman 2 is probably the best we have at the moment. Other thoughts?


by Stef at May 03, 2012 08:41 PM

May 01, 2012

Florian Haas

More details on OSCON 2012, and your chance to get in cheaper!

A few more details on my speaking slot at this year's OSCON, titled Highly Available Cloud: OpenStack Integration with Pacemaker.


by florian at May 01, 2012 06:29 PM

April 27, 2012

Yong Sheng Gong

OpenStack nova-scheduler and its algorithm

Abstract

Among the current core projects of OpenStack, Nova is the core of the cores. As described on the OpenStack website, Nova is a cloud computing fabric controller, the main part of an IaaS system. There are more than 20 binaries in the OpenStack Nova project. Among them, nova-scheduler is responsible, among other things, for deciding which compute node host should launch an image instance (a server, in OpenStack terms). This article describes how this component does its job together with other components, and how it makes decisions when faced with more than one compute node host and one instantiation request.

Overview

Just as said on the OpenStack website, OpenStack's mission is to produce the ubiquitous Open Source Cloud Computing platform that will meet the needs of public and private clouds regardless of size, by being simple to implement and massively scalable. In such cloud environments, there is often more than one compute node to instantiate image instances on. How to manage and measure these compute nodes is a very prominent problem. Based on some data, how to react to one user's request is a hot spot as well.

Let's first look at where nova-scheduler fits in the big picture.
image
 

Note: This figure is from http://ken.pepple.info/openstack/2012/02/21/revisit-openstack-architecture-diablo/.

As the figure above shows, nova-scheduler interacts with other components through the queue and the central database repository. For scheduling, the queue is the essential communications hub.

All compute nodes (also known as hosts in terms of OpenStack) periodically publish their status, resources available and hardware capabilities to nova-scheduler through the queue. nova-scheduler then collects this data and uses it to make decisions when a request comes in.

image 

Note: this picture comes from nova project.

The picture above shows the general idea of how the scheduler does its main job. The whole process divides into two phases. The filtering phase generates a list of suitable hosts by applying filters. The weighting phase sorts the hosts according to their weighted cost scores, which are given by applying some cost functions. The sorted list of hosts contains the candidates to fulfill the user's request. How many hosts in this list will be used depends on the number of instances requested in one request.
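The two phases can be illustrated with a minimal Python sketch. It is not nova's code: hosts are plain dicts, the filters and cost functions are hypothetical lambdas, and the re-evaluation that happens between instances of the same request is left out.

```python
def schedule(hosts, filters, cost_fns, num_instances):
    """hosts: list of dicts; filters: predicates; cost_fns: list of (weight, fn)."""
    # Phase 1: filtering keeps only the hosts that every filter accepts.
    candidates = [h for h in hosts if all(f(h) for f in filters)]

    # Phase 2: weighting gives each host a cost; the lowest cost sorts first.
    def cost(host):
        return sum(weight * fn(host) for weight, fn in cost_fns)

    ranked = sorted(candidates, key=cost)
    return ranked[:num_instances]


hosts = [
    {"name": "a", "free_ram_mb": 2048, "enabled": True},
    {"name": "b", "free_ram_mb": 512, "enabled": True},
    {"name": "c", "free_ram_mb": 4096, "enabled": False},
]
filters = [lambda h: h["enabled"], lambda h: h["free_ram_mb"] >= 512]
cost_fns = [(1.0, lambda h: h["free_ram_mb"])]

chosen = [h["name"] for h in schedule(hosts, filters, cost_fns, 2)]
```

Host c is dropped by the first filter, and the remaining hosts are ordered by free RAM, so chosen is ["b", "a"].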

Following this overview, the rest of this article describes:

1. What an instantiation request looks like and how it gets to nova-scheduler;

2. What the main components in nova-scheduler are;

3. How the nova-scheduler components collaborate to finish the scheduling work.

Scheduler Invocation

To depict the work of nova-scheduler, we have to talk a little about nova-api first. Just as with other nova binaries, nova-api is a WSGI server. Python Routes is used to map RESTful URLs to the methods of internal Controllers.

image 

The figure above shows the main methods involved when nova-api responds to an image instantiation request. First, the HTTP request is mapped to the Controller's create() method. This method processes the request body and then invokes compute_api's create() method, which in turn calls _create_instance(). Finally, the _schedule_run_instance() method calls rpc_method() to send a message onto the message queue for nova-scheduler.

The rpc_method is called as follows:

    rpc_method(context,
               FLAGS.scheduler_topic,
               {"method": "run_instance",
                "args": {"topic": FLAGS.compute_topic,
                         "request_spec": request_spec,
                         "admin_password": admin_password,
                         "injected_files": injected_files,
                         "requested_networks": requested_networks,
                         "is_first_time": True,
                         "filter_properties": filter_properties}})

In the RPC message, the field injected_files represents files that will be injected into the VM disk image, while requested_networks carries network information, such as which network(s) will be used by the instance(s).

Two other parts of this message, request_spec and filter_properties, need further explanation. We will use the following command to generate sample values for the following sections:

    nova boot --image a3fb743d-42df-49ba-b9c4-8042ebbd344e --flavor 1 myserver --hint test=testvalue --availability_zone=myzone::testhost

filter_properties

The filter_properties part of the RPC message is there to help nova-scheduler. Normally, it contains the scheduler_hints information from the user request. The previous nova boot command produces sample content like this:

    filter_properties: {
        'scheduler_hints': {
            'force_hosts': [u'testhost'],
            u'test': u'testvalue'
        }
    }

If our availability_zone complies with the pattern zone:xx:host, a force_hosts field will be in scheduler_hints. The value of force_hosts can target the request at a given host directly, before going through the scheduler's filters. There can also be ignore_hosts in scheduler_hints, which means the specified hosts will be skipped during scheduling. Both force_hosts and ignore_hosts are applied before going through the filters. Please see the section Inside of FilterScheduler below.

request_spec

The request_spec part of the RPC message is an encapsulation or normalization of the HTTP request.

Below is the sample content of request_spec in the RPC message produced by the previous nova boot command.

    {
        'num_instances': 1,
        'block_device_mapping': [],
        'image': {
            'status': 'active',
            'name': 'cirros_blank',
            'deleted': False,
            'container_format': 'ami',
            'created_at': '2012-04-05 14:26:24',
            'disk_format': 'ami',
            'updated_at': '2012-04-05 14:26:25',
            'properties': {
                'kernel_id': '46bf134e-2e6e-472a-a159-f4cd51f36d84',
                'ramdisk_id': '106dc550-783e-4de7-951d-f4f3d5427698'
            },
            'min_ram': '0',
            'checksum': '2f81976cae15c16ef0010c51e3a6c163',
            'min_disk': '0',
            'is_public': True,
            'deleted_at': None,
            'id': 'a3fb743d-42df-49ba-b9c4-8042ebbd344e',
            'size': 25165824
        },
        'instance_type': {
            'root_gb': 0L,
            'name': u'm1.tiny',
            'deleted': False,
            'created_at': None,
            'ephemeral_gb': 0L,
            'updated_at': None,
            'memory_mb': 512L,
            'vcpus': 1L,
            'flavorid': u'1',
            'swap': 0L,
            'rxtx_factor': 1.0,
            'extra_specs': {},
            'deleted_at': None,
            'vcpu_weight': None,
            'id': 2L
        },
        'instance_properties': {  # used to consume virtual resources
            'vm_state': 'building',
            'ephemeral_gb': 0L,
            'access_ip_v6': None,
            'access_ip_v4': None,
            'kernel_id': '46bf134e-2e6e-472a-a159-f4cd51f36d84',
            'key_name': None,
            'ramdisk_id': '106dc550-783e-4de7-951d-f4f3d5427698',
            'instance_type_id': 2L,
            'user_data': '',
            'vm_mode': None,
            'display_name': u'myserver',
            'config_drive_id': '',
            'reservation_id': 'r-bdbnl7aa',
            'key_data': None,
            'root_gb': 0L,
            'user_id': u'81ced34d11954800906096555539c885',
            'uuid': u'4ccc7c93-cbde-4233-a7cb-5db81f82489b',
            'root_device_name': None,
            'availability_zone': u'myzone',  # default to FLAGS.default_schedule_zone
            'launch_time': '2012-04-11T15:08:55Z',
            'metadata': {},
            'display_description': u'myserver',
            'memory_mb': 512L,
            'launch_index': 0,
            'vcpus': 1L,
            'locked': False,
            'image_ref': u'a3fb743d-42df-49ba-b9c4-8042ebbd344e',
            'architecture': None,
            'power_state': 0,
            'auto_disk_config': None,
            'progress': 0,
            'os_type': None,
            'project_id': u'9d049e4b60b64716978ab415e6fbd5c0',
            'config_drive': ''
        },
        'security_group': ['default']
    }


Some values in instance_properties are copied from instance_type. Both copies of these values play a role in scheduler's work. Red colored parts are important to nova scheduler. Most of them can be used in filters and cost functions.

Nova-scheduler class diagram

image 

Before diving into how nova-scheduler deals with the instantiation request message, we had better look at the data structures it uses.

As the figure above shows, many classes or modules work together in nova-scheduler:

1. SchedulerManager sits between the queue and the other nova-scheduler components. It receives requests from the queue and delegates jobs to its driver. The driver is defined by the configuration option FLAGS.scheduler_driver, with "nova.scheduler.multi.MultiScheduler" as the default value.

2. Scheduler, the parent class of all other schedulers, has compute_api and host_manager attributes. The value of compute_api is nova.compute.api.API. The value of host_manager is defined by FLAGS.scheduler_host_manager, with "nova.scheduler.host_manager.HostManager" as the default value.

3. MultiScheduler is a subclass of Scheduler designed to delegate to configurable drivers per resource type (compute and volume today). The value of compute_driver is defined by the configuration option FLAGS.compute_scheduler_driver, with the default value "nova.scheduler.filter_scheduler.FilterScheduler". The value of volume_driver is outside this article's scope.

4. FilterScheduler is responsible for selecting hosts and provisioning resources. It chooses the host by applying filters and calculating weighted costs. The host which passes the filters and has the least cost wins.

5. ChanceScheduler chooses the host randomly from the running hosts.

6. SimpleScheduler chooses the host based on running cores. The host with the fewest running cores wins.

7. API is the compute API, used to call the API service of OpenStack compute.

8. HostManager is for collecting and saving host data.

9. Module least_cost contains the cost function and the WeightedHost class.

10. WeightedHost is a value object, with weight and hoststate as its two fields.

11. HostState records a host's capabilities and the virtual consumption of resources during the processing of one request.
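Several of the classes above are chosen through configuration options whose values are dotted class paths. The sketch below shows, with the standard library only, how such a string can be turned into a class. It is an illustration of the idea, not nova's own loader, and import_class here is a hypothetical helper.

```python
import importlib


def import_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# With nova installed, the value could be for example
# "nova.scheduler.filter_scheduler.FilterScheduler"; a standard-library
# class is used here so that the sketch runs anywhere.
driver_cls = import_class("collections.OrderedDict")
driver = driver_cls()
```

Swapping the scheduler then only requires changing the string in the configuration, which is why the defaults listed above can be overridden without touching code.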

Inside of nova-scheduler

image

When the request is sent out in the form of an RPC message, it is received by nova-scheduler, which calls SchedulerManager's run_instance() method. Following the calling chain, control eventually arrives at FilterScheduler's schedule_run_instance() method.
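The path from message to method can be sketched as a name-based dispatch: the "method" key of the message selects a method on the manager, and "args" become its keyword arguments. All classes below are hypothetical stand-ins written for this illustration, and the real calling chain has more steps.

```python
class SketchManager(object):
    """Hypothetical stand-in for a manager that receives RPC messages."""

    def __init__(self, driver):
        self.driver = driver

    def run_instance(self, context, **kwargs):
        # Delegate the real work to the configured scheduler driver.
        return self.driver.schedule_run_instance(context, **kwargs)


class EchoDriver(object):
    """Hypothetical driver that only reports what it was asked to do."""

    def schedule_run_instance(self, context, request_spec, **kwargs):
        return "scheduling %d instance(s)" % request_spec["num_instances"]


def dispatch(manager, context, message):
    """Look up message['method'] on the manager and call it with message['args']."""
    method = getattr(manager, message["method"])
    return method(context, **message["args"])


message = {"method": "run_instance",
           "args": {"request_spec": {"num_instances": 1},
                    "filter_properties": {}}}
result = dispatch(SketchManager(EchoDriver()), None, message)
```

Here result is the string returned by the driver, which shows that the manager itself makes no placement decision.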

Inside of FilterScheduler

By default, compute related scheduling will come to this class. Its schedule_run_instance() method will take the control to fulfill user request.

image

The diagram above shows the main tasks done by the schedule_run_instance() method:

1. _schedule()

This method selects some hosts on which to instantiate the image. By design, an instantiation request can ask for more than one instance, so the method returns a list of WeightedHosts sorted by weight, least weight first. This function also populates filter_properties with more data, such as request_spec, config_options and instance_type, before calling the filters and cost functions.

1.1 get_cost_function()


    ###### (FloatOpt) How much weight to give the fill-first cost function. A negative value will reverse behavior: e.g. spread-first
    compute_fill_first_cost_fn_weight=-1.0

    ###### (ListOpt) Which cost functions the LeastCostScheduler should use
    least_cost_functions="nova.scheduler.least_cost.compute_fill_first_cost_fn"

    ###### (FloatOpt) How much weight to give the noop cost function

FilterScheduler's cost functions are organized as a list of tuples of weight and cost function. The functions are defined by FLAGS.least_cost_functions, and the corresponding weights are defined in separate options. For example, in the configuration fragment above, compute_fill_first_cost_fn_weight defines the weight for the default function nova.scheduler.least_cost.compute_fill_first_cost_fn. As of writing, this function just returns the free RAM of a HostState:

    def compute_fill_first_cost_fn(host_state, weighing_properties):
        """More free ram = higher weight. So servers with less free
        ram will be preferred."""
        return host_state.free_ram_mb
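The effect of the weight's sign can be seen in a small sketch. Host states are plain dicts here and pick() is a hypothetical helper; only the cost function mirrors the one quoted above.

```python
def compute_fill_first_cost_fn(host_state, weighing_properties):
    # Same idea as the function above, with a dict instead of a HostState.
    return host_state["free_ram_mb"]


def pick(hosts, weight):
    """Return the name of the host with the lowest weighted cost."""
    best = min(hosts, key=lambda h: weight * compute_fill_first_cost_fn(h, {}))
    return best["name"]


hosts = [{"name": "nearly_full", "free_ram_mb": 512},
         {"name": "nearly_empty", "free_ram_mb": 8192}]

fill_first = pick(hosts, 1.0)     # lowest free RAM wins: instances are packed
spread_first = pick(hosts, -1.0)  # sign flipped: the emptiest host wins
```

With a weight of 1.0 the nearly full host is chosen, and with -1.0 the nearly empty one, which is the spread-first behavior mentioned in the option's description.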


1.2 get_all_host_states()

It returns a dict of all the hosts the HostManager knows about. Also, each of the consumable resources in HostState is pre-populated and adjusted based on capabilities data of HostManager. A sample of the returned dict looks like {"host1":hoststate for host1, "host2":hoststate for host2,...}. Please see later sections for hoststate.

 

1.3 filter_host()

This function takes a list of HostStates and filter_properties as parameters and returns those hosts which pass the filters.
The filters available are defined by FLAGS.scheduler_available_filters, with "nova.scheduler.filters.standard_filters" as the default value. In fact, it traverses the filter path and returns a list of filter classes. As of now, the list is:

    'nova.scheduler.filters.isolated_hosts_filter.IsolatedHostsFilter'
    'nova.scheduler.filters.compute_filter.ComputeFilter'
    'nova.scheduler.filters.availability_zone_filter.AvailabilityZoneFilter'
    'nova.scheduler.filters.ram_filter.RamFilter'
    'nova.scheduler.filters.json_filter.JsonFilter'
    'nova.scheduler.filters.all_hosts_filter.AllHostsFilter'
    'nova.scheduler.filters.core_filter.CoreFilter'
    'nova.scheduler.filters.affinity_filter.AffinityFilter'
    'nova.scheduler.filters.affinity_filter.DifferentHostFilter'
    'nova.scheduler.filters.affinity_filter.SameHostFilter'
    'nova.scheduler.filters.affinity_filter.SimpleCIDRAffinityFilter'

 
For how each filter filters hosts, please see the FilterScheduler development reference.

Filters used are defined by FLAGS.scheduler_default_filters with "AvailabilityZoneFilter,RamFilter,ComputeFilter" as default value.

Each Filter has defined a host_passes() function which receives HostState and filter_properties as parameters and returns bool to indicate if the host specified in HostState is a good candidate for this filter.
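A filter in this style could look like the sketch below. It is a hypothetical class written for illustration and not one of the filters shipped with nova; the host state is a plain dict, and the instance_type entry is assumed to be present in filter_properties as described for _schedule().

```python
class SketchRamFilter(object):
    """Hypothetical filter: pass hosts with enough free RAM for the flavor."""

    def host_passes(self, host_state, filter_properties):
        instance_type = filter_properties.get("instance_type", {})
        requested_mb = instance_type.get("memory_mb", 0)
        # True means the host stays in the candidate list.
        return host_state["free_ram_mb"] >= requested_mb


ram_filter = SketchRamFilter()
props = {"instance_type": {"memory_mb": 512}}
big_enough = ram_filter.host_passes({"free_ram_mb": 1024}, props)
too_small = ram_filter.host_passes({"free_ram_mb": 256}, props)
```

Each filter only answers yes or no for one host; ranking the hosts that pass is left to the cost functions.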

1.4 passes_filters()

With each HostState object, the filter_host() method calls its passes_filters() to check whether the host passes all the defined filters. Before going through the filters, this method checks whether the host complies with the rules defined by the force_hosts and ignore_hosts fields of scheduler_hints. If the ignore_hosts field exists and the host represented by the HostState is in the list, the host fails. If the force_hosts field exists, the host represented by the HostState object passes only if it is in force_hosts. After these rules, if not filtered out, the host goes through the filters until one of them fails. If all filters pass, the host moves on to the next phase, cost weighting.
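The order of these checks can be sketched as follows. This is one reading of the description, under two stated assumptions: the hints are looked up under scheduler_hints as in the sample above, and a host listed in force_hosts is accepted without running the filters. The function and its arguments are hypothetical.

```python
def passes_filters(host_name, host_state, filter_fns, filter_properties):
    """filter_fns: callables taking (host_state, filter_properties) -> bool."""
    hints = filter_properties.get("scheduler_hints", {})
    if host_name in hints.get("ignore_hosts", []):
        return False
    force_hosts = hints.get("force_hosts", [])
    if force_hosts:
        # Assumption: forced hosts skip the filters, all other hosts fail.
        return host_name in force_hosts
    # all() stops at the first filter that returns False.
    return all(fn(host_state, filter_properties) for fn in filter_fns)


def has_ram(host_state, filter_properties):
    return host_state["free_ram_mb"] >= 512


forced = {"scheduler_hints": {"force_hosts": ["testhost"]}}
ignored = {"scheduler_hints": {"ignore_hosts": ["testhost"]}}

a = passes_filters("testhost", {"free_ram_mb": 0}, [has_ram], forced)
b = passes_filters("otherhost", {"free_ram_mb": 4096}, [has_ram], forced)
c = passes_filters("testhost", {"free_ram_mb": 4096}, [has_ram], ignored)
d = passes_filters("otherhost", {"free_ram_mb": 4096}, [has_ram], {})
```

In this sketch a and d are True while b and c are False: the forced host passes even without free RAM, and every other host is rejected while force_hosts is set.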

1.5 weighted_sum()

 With cost_functions returned by get_cost_function(), HostStates returned by filter_host(), and filter_properties as parameters, this function will first score each host by running each cost function to generate a grid kind of like the table below:


            fn#1        fn#2        fn#n
    Host1   Score#1_1   Score#1_2   Score#1_n
    Host2   Score#2_1   Score#2_2   Score#2_n
    Hostn   Score#n_1   Score#n_2   Score#n_n

Then it calculates the weighted score for each host by multiplying the score and weight of each cost function, producing a list of weighted final scores. The formula used is:

Final score of a certain host = ∑(weight of cost function * score returned by this function for the host)

Then it associates the scores with the HostStates to generate a list of tuples of score and HostState.

Finally, it sorts the tuples and returns a WeightedHost built from the first tuple. This way the least-cost host wins.
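The grid and the formula can be put together in a short sketch. It follows the description rather than nova's source: hosts are dicts, the result is a plain (score, host) tuple instead of a WeightedHost, and the second cost function with its weight of 100.0 is invented for the example.

```python
def weighted_sum(weighted_fns, host_states, weighing_properties):
    """weighted_fns: list of (weight, fn). Returns the (score, host) with the lowest score."""
    scored = []
    for host in host_states:
        # One row of the grid: the score from each cost function for this host.
        row = [fn(host, weighing_properties) for _, fn in weighted_fns]
        total = sum(weight * score
                    for (weight, _), score in zip(weighted_fns, row))
        scored.append((total, host))
    scored.sort(key=lambda pair: pair[0])
    return scored[0]


weighted_fns = [
    (1.0, lambda h, props: h["free_ram_mb"]),      # fill-first style cost
    (100.0, lambda h, props: h["num_instances"]),  # hypothetical second cost
]
hosts = [
    {"name": "h1", "free_ram_mb": 2048, "num_instances": 1},
    {"name": "h2", "free_ram_mb": 1024, "num_instances": 12},
]
score, winner = weighted_sum(weighted_fns, hosts, {})
```

Host h1 scores 1.0 * 2048 + 100.0 * 1 = 2148 and h2 scores 1.0 * 1024 + 100.0 * 12 = 2224, so h1 has the least cost and is returned.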

 

1.6 getHostState()

It returns the HostState object from selected WeightedHost

1.7 consume_from_instance()

It takes instance_properties as parameter, which comes from request_spec. This function adjusts HostState object's data to virtually consume the resources so that the HostState object can enter into the next loop of host choosing for this request's next instance.
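The sketch below illustrates why this virtual consumption matters when one request asks for several instances. SketchHostState and place() are hypothetical, only RAM is tracked, and the host choice is reduced to a fill-first rule.

```python
class SketchHostState(object):
    """Hypothetical, RAM-only stand-in for a host's scheduling state."""

    def __init__(self, host, free_ram_mb):
        self.host = host
        self.free_ram_mb = free_ram_mb

    def consume_from_instance(self, instance_properties):
        # Book the RAM in memory only; nothing is created on the host yet.
        self.free_ram_mb -= instance_properties["memory_mb"]


def place(host_states, instance_properties, num_instances):
    """Choose a host for each instance, updating the states between choices."""
    placements = []
    for _ in range(num_instances):
        fitting = [h for h in host_states
                   if h.free_ram_mb >= instance_properties["memory_mb"]]
        if not fitting:
            break
        best = min(fitting, key=lambda h: h.free_ram_mb)  # fill-first
        best.consume_from_instance(instance_properties)
        placements.append(best.host)
    return placements


hosts = [SketchHostState("host1", 1024), SketchHostState("host2", 2048)]
placements = place(hosts, {"memory_mb": 512}, 3)
```

The first two instances go to host1, which is then full, so the third lands on host2. Without the bookkeeping, all three would have been sent to host1.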

2 _provision_resource()

This function creates requested resource, such as an image instance.

2.1 cast_to_compute_host()

 This function casts a queue message to target host so that it will create an image instance according to the request_spec.

Scheduler’s intelligent data - Host Capabilities

image

Host capability data is another important input for the scheduler. Every OpenStack service, such as nova-compute or nova-network, can publish its capabilities. The figure above depicts how a compute host updates the capabilities of its compute service and publishes them to nova-scheduler, and how nova-scheduler saves this data for later use. Roughly, the whole process splits into three parts:

1. To collect capabilities

ComputeManager's _report_driver_status() method is a periodic task, which calls update_service_capabilities() to update the capabilities. LibvirtConnection does the actual job (there are other connection types; which one is used depends on the configuration in nova.conf and is usually hypervisor dependent). Its get_host_stats() method is used to collect host capabilities.

One sample of capabilities data looks like:

    {
        u'disk_available': 226,
        u'cpu_info': {
            u'vendor': u'Intel',
            u'model': u'Westmere',
            u'arch': u'x86_64',
            u'features': [u'rdtscp', u'x2apic', u'xtpr', u'tm2', u'est', u'vmx', u'ds_cpl', u'monitor', u'pbe', u'tm', u'ht', u'ss', u'acpi', u'ds', u'vme'],
            u'topology': {
                u'cores': u'2',
                u'threads': u'2',
                u'sockets': u'1'
            }
        },
        u'hypervisor_type': u'QEMU',
        u'vcpus_used': 0,
        u'vcpus': 4,
        u'host_memory_free': 1718,
        u'disk_total': 375,
        u'host_memory_total': 3845,
        u'hypervisor_version': 15000,
        u'disk_used': 149
    }


2. To publish capabilities

The method publish_service_capabilities() is another periodic task of ComputeManager. It delegates to the scheduler API to send the capabilities out onto the message queue. Besides the capabilities, the message carries the topic 'scheduler', the service name 'compute' and the hostname.

3. To receive capabilities

When the message is on the queue, nova-scheduler picks it up and calls the SchedulerManager, which calls Scheduler's update_service_capabilities() method, which in turn invokes the HostManager's update_service_capabilities() method. After that, the capabilities data of the given service on the given host is kept by HostManager until the next update.
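The receiving side can be pictured as a small in-memory store keyed by host and service, where each new report replaces the previous one. SketchHostManager, its attribute names and the timestamp field are assumptions made for this illustration.

```python
import time


class SketchHostManager(object):
    """Hypothetical store: latest capabilities per host and per service."""

    def __init__(self):
        self.service_states = {}  # {host: {service_name: capabilities}}

    def update_service_capabilities(self, service_name, host, capabilities):
        entry = dict(capabilities)
        entry["timestamp"] = time.time()  # when this report was received
        self.service_states.setdefault(host, {})[service_name] = entry

    def free_ram_mb(self, host):
        return self.service_states[host]["compute"]["host_memory_free"]


manager = SketchHostManager()
manager.update_service_capabilities(
    "compute", "host1", {"host_memory_free": 1718, "vcpus": 4})
manager.update_service_capabilities(
    "compute", "host1", {"host_memory_free": 1206, "vcpus": 4})
```

After the second report, manager.free_ram_mb("host1") reflects the newer value; the scheduler always works from the most recent data it has received.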

Summary

To wrap up, as soon as a cloud scales to two hosts, the scheduler has a role to play. The more hosts there are in a cloud, the more important the scheduler is. Among all the inputs to nova-scheduler, three are important: the configuration in nova.conf, the service capabilities of each host, and the request spec. The configuration in nova.conf decides the static and run-time class structure, the service capabilities work as the base intelligence data, and the request spec is the service target. Nova-scheduler can schedule to certain hosts and skip some hosts according to the request spec. In addition to the hosts specified in the request spec, the zone concept can also help the scheduler distribute requests to zone member hosts. Once you know the internals, the default behavior of nova-scheduler can easily be modified in nova.conf.

Resources

1. OpenStack wiki website

2. OpenStack Nova GitHub website

3. Revisiting OpenStack Architecture: Essex Edition

4. FilterScheduler development reference

 

by YongShengGong at April 27, 2012 11:12 PM

OpenStack Blog

Community Weekly Review (Apr 20-27)

OpenStack Community Newsletter – April 27, 2012

Welcome back to our regular publishing schedule. This week we still hear the echo of the Design Summit and Conference.

HIGHLIGHTS

Upcoming Events

Other news

Community Statistics

This week’s chart shows the geographical dispersion of participants in the Folsom series of events in San Francisco. The information is derived from the work address provided by participants when they registered. Participants from the USA were the large majority, around 70% of the over 1,000 participants; nonetheless it’s interesting to look at the distribution once the outlier is removed.

Participants to Folsom Design Summit and Conference, per nation (excluding USA)

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at April 27, 2012 09:50 PM

Stefano Maffulli

Proud member of an effective community

In days like this I feel very lucky to be a member of the OpenStack community. The thread that started today demonstrates the great culture of collaboration among the people that make OpenStack.

Following up on the discussion held at the Design Summit, Thierry suggested splitting the mailing list in order to improve communication among developers and users. The proposal requires a significant change in the current workflow for developers and adds a new burden on the infrastructure team. A reason to be proud of this community is that the infrastructure team didn’t say ‘no’ like many IT shops would. They highlighted what they needed in order to do a good job satisfying the request. A volunteer jumped up to help (thank you Duncan) and off we go to do something without wasting time debating. This community is effective and gets things done.


by Stef at April 27, 2012 05:27 PM

April 26, 2012

Stefano Maffulli

Back from OpenStack Design Summit and Conference

What a week! We had over 1,000 participants from 26 countries (Japan and UK follow USA for total number of participants, with Australia, Canada and South Korea in the next cluster), 159 sessions during the summit and 56 during the conference, over 40 hours of intense discussions, decisions and night-time fun with a total of 8 parties. No wonder some (me first) had to take a break to recover from it :)

I led two sessions about community and participated in another few. All have actionable items. Below are the notes I took. In the next weeks I’d like to gather comments and proceed with implementation.

How to track contributions to the community  (etherpad notes)

I showed the results of the work I’ve been doing with the metrics about OpenStack development. The plan is to build a datawarehouse that hosts data from git and the bug tracker, plus data from forums, mailing lists, IRC and gerrit. Currently the datawarehouse hosts data extracted from git. A couple of days before the summit the developers of bicho released the extension  to gather data from Launchpad bug tracker  (the job was sponsored by Rackspace). Mark McLoughlin used git-dm to gather data from OpenStack git repository, creating a good master data record for the association developer-company.

Many people liked the analysis done after the Essex release, while the weekly reports are less interesting. The other important comment is that we should all be careful with which  numbers we decide to track as these can lead to “games” to trick the system.

I gathered that it would be more  interesting to have monthly reports and more comprehensive coverage of what is going on with the project. Thankfully, Jake Dahn is working on a self service portal as a frontend to the datawarehouse (his work is on github).

Communication – IRC, mailing lists, blogs, forums (etherpad notes)

General agreement is that the developers mailing list on Launchpad has too much traffic (see the chart http://openstack.markmail.org/). The plan is to keep the Launchpad list for general discussions and move developers to a new set of lists, managed by a powerful list manager that allows automatic tagging of messages, per project. Ideally, a developer would send a message about Quantum to something like openstack-dev-quantum@listmanager with subject “foo bar”. The message would have its subject changed to “[Quantum] foo bar” and it would be delivered to openstack-dev@listmanager. Even if mailman and other list managers have this feature, implementing and running a large email server like this would require resources. The openstack-infra team would help manage the machines but the actual email server administration should be done by somebody else. We need volunteers.
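The tagging step described here can be sketched with Python's standard email package. The list addresses are the placeholder ones from the paragraph above, retag() is a hypothetical helper, and a real list manager would do this through its own configuration rather than custom code.

```python
from email.message import EmailMessage


def retag(message, project, target_list):
    """Prefix the subject with [Project] and redirect the message to target_list."""
    tag = "[%s]" % project
    subject = message["Subject"] or ""
    if not subject.startswith(tag):
        subject = "%s %s" % (tag, subject)
    # Headers must be deleted before being set again, or they would be duplicated.
    del message["Subject"]
    message["Subject"] = subject
    del message["To"]
    message["To"] = target_list
    return message


msg = EmailMessage()
msg["To"] = "openstack-dev-quantum@listmanager"
msg["Subject"] = "foo bar"
retag(msg, "Quantum", "openstack-dev@listmanager")
```

After the call, the subject reads "[Quantum] foo bar" and the message is addressed to the common developers list, so subscribers can filter by tag in their mail client.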

The IRC channels seem to be suffering: #openstack and #openstack-dev are not helping create a sense of cohesion among developers and users. The suggestion is to keep #openstack and split #openstack-dev into smaller channels for each project. The idea is that users and developers will all hang around #openstack, finding questions to answers while the smaller project-dedicated channels will help create bonds among developers.

How to help users get answers to their questions is the main issue we discussed in the section. Currently questions come in from messages to the mailing lists operators and developers, on forums, IRC channel, Linkedin group, Disqus on docs.openstack.org and more. The general feeling is that Launchpad Answers product is not adequate for the task, mainly because it’s hard to measure anything on it and since there are now six projects where you can ask question, and sometimes the answer is related to two projects not just one.  The OpenStack forums seem to be gaining traction and should be advertised more prominently from openstack.org domain. We should investigate the possibility to use something similar to StackOverflow.

The OpenStack Planet suffers from having little visibility but, like the forums, it has lots of good content. Also, lots of content that could be in the planet stays out because to add a blog you need to be a developer (you need to use git and gerrit). There was consensus on adding a plugin to the OpenStack blog to manage syndication of content directly from openstack.org/blog and get rid of planet.

Panel: Expanding the Community

I moderated the panel with Boris Renski, Masanori Itoh, Joseph B George, Tristan Goode, Yoyo Chiang and important contributions from Heng Chui and Sammy Luo from China. There will soon be a video online, but the gist of the session is that we need to add more content in local languages to openstack.org, and we need to publish and keep updating the document OpenStack User Group HOWTO. Other comments were that starting a new user group is easy, doesn’t need bureaucracy, and can help companies build a local reputation as OpenStack experts.

I used a few slides to introduce the topic Growing the OpenStack International Community: http://www.slideshare.net/slideshow/embed_code/12624301


by Stef at April 26, 2012 07:14 PM

April 25, 2012

Duncan McGreggor

New Life: The OpenStack DevOps Community

Last week's OpenStack Design Summit and Conference were pretty fantastic events. Lots of great technical discussions, some good initial planning on large tasks, incredible numbers of conversations -- all of high quality! Looking back, though, I'd have to say that the high point of the event for me was what turned into the DevOps community brainstorm session (etherpad link) at the Design Summit (all etherpads are here).

The session was approved with the title of "OpenStack and Operations: Getting Real" with a major goal of deciding whether we needed to create a new top-level project for DevOps. So it was really "New DevOps Team?" However, all that changed once we got started, and there was some amazing feedback and enthusiasm in that room. By the end of it, we were all pretty pumped up and ready to jump in with all our hands and feet!

The highest-level summary for me was this: The DevOps folks in the OpenStack community really need a point to rally around. Someplace they can not only talk to each other, share stories, get advice, etc., but also where they can have their voices heard by the predominantly developer-oriented OpenStack community. Jay Pipes suggested that a new section be added to the weekly IRC team leader meetings, and as Nova Ops subteam lead (teams list), I volunteered to collect top issues from the DevOps (sub-)community and give these some air-time in the meetings.

Some other highlights from the session:

  • trystack.org needs volunteer admins -- a great opportunity for improving the dev <-> ops interface
  • CERN is deeply invested in DevOps, and wants to share data and possibly help with defining reference architectures
  • The same goes for the University of Melbourne!
  • There was a good corporate presence in the session too, rallying around a new and improved DevOps community: HP, Yahoo, AT&T, Canonical, Rackspace, DreamHost, and more (sorry if I forgot you!)

There is a tremendous amount of information that we collected, both in written form (etherpads) and in conversations during last week's event. As this is collected, we'll be reaching out to folks via the operators mail list, blog posts, tweets, and Google+ messages. Be sure to do the same! If you've got concerns, bring them up on the mail list; we'll get some bugs filed and blueprints put together, and come up with plans for addressing these.

Thanks again, everyone!


by Duncan McGreggor (noreply@blogger.com) at April 25, 2012 07:47 PM

Doug Hellmann

Notes from OpenStack Folsom Design Summit Spring 2012


I have published my notes from the Design Summit and Conference
kicking off the Folsom release of OpenStack, held April 16-20, 2012 in
San Francisco, CA. Much more happened than I could hope to capture in
a single post, but the notes should give a sense of the sorts of
topics discussed.

tl;dr

This was a good week for the DreamHost development team. We began
establishing ourselves as contributors in the OpenStack community, met
a lot of people we will need to be working with over the next few
months, and learned a lot about OpenStack in general. We also identified
several projects that need our help, and found some tools we should be
able to use for our implementation.

Read more...


by Doug Hellmann (noreply@blogger.com) at April 25, 2012 07:29 PM

James E. Blair

Cinder: a Success Story in Automating Project Infrastructure

OpenStack projects have gated trunks -- that is, every change to an OpenStack project must pass unit and integration tests.  Each one requires a number of Jenkins jobs to accomplish this, and some support within the project in the form of configuration files and test interfaces.  Until recently, this was managed in an ad-hoc manner, but as we add projects and tests, it won't scale.  We currently have 235 Jenkins jobs, and that's way too many to manage manually.

Enter the standardized Project Testing Interface.  It lays out all the processes for testing and distribution that a project needs to support to work with the OpenStack Jenkins system.  By standardizing this, we can start to manage Jenkins jobs collectively instead of individually.  This means that not only is it easier to add new projects, but we can be sure that existing projects benefit from improvements in the system and avoid bit-rot.

The OpenStack Common project helps ensure that the code in each project that handles project setup, dependencies, versions, etc, is kept in sync and standardized.  The Project Testing Interface depends on openstack-common for the project-side of its implementation.

Finally, the OpenStack CI team (Andrew Hutchings in particular) has been developing a system to manage our Jenkins configuration within puppet.  That's how we plan on managing groups of Jenkins jobs, and it also means that changes to the Jenkins configuration can go through code review, just like any other change to the project.  Anyone can submit changes to the running Jenkins configuration without any special administrative privileges.

All of these efforts came together this morning when we bootstrapped the Cinder project, the breakout of the volumes component from Nova.  Adding "cinder" to the list of standard python jobs in puppet caused all of our standard packaging and gating jobs to be created in Jenkins.  OpenStack Common generated the skeleton code for the project that conforms to the Project Testing Interface.  And when that code was submitted for review, it passed the automatically-created gate jobs.  From an infrastructure standpoint, Cinder went from an empty repository to a fully integrated OpenStack project in just a few minutes.
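
To illustrate why adding a single name to a list is enough, here is a minimal sketch of template expansion in Python. It is not the CI team's puppet code: the template names and the helper `jobs_for` are hypothetical and stand in for whatever the real "standard python jobs" definition contains.

```python
# Hypothetical job-name templates; the real list lives in the CI team's
# puppet configuration and is not reproduced in this post.
STANDARD_PYTHON_JOB_TEMPLATES = (
    "gate-{project}-pep8",
    "gate-{project}-python27",
    "{project}-tarball",
)


def jobs_for(projects, templates=STANDARD_PYTHON_JOB_TEMPLATES):
    """Expand every template once per project, keeping project order."""
    return [t.format(project=p) for p in projects for t in templates]


print(jobs_for(["cinder"]))
```

Because the jobs are derived from one shared set of templates, improving a template improves every project's jobs at once, which is the bit-rot protection described earlier.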

 

by James E. Blair (corvus@gnu.org) at April 25, 2012 07:14 PM

Grid Dynamics OpenStack Team

RHEL and Centos RPM packages for the OpenStack Essex (2012.1) release are out

We are happy to announce that we have prepared Essex RPM packages for RHEL and Centos. We have also tested the packages for compatibility with Scientific Linux. For Centos you should also enable the EPEL repository.

RHEL yum repo: http://yum.griddynamics.net/yum/essex/.

Centos yum repo: http://yum.griddynamics.net/yum/essex-centos/

Essex Packages:

  • openstack-nova-essex
  • openstack-glance-essex
  • openstack-keystone-essex
  • openstack-swift-essex
  • openstack-quantum-essex
  • python-quantumclient
  • python-novaclient-essex

Setup instructions for Essex (for testing, not for production):

http://openstack.griddynamics.com/setup_single_essex.html.

We welcome your questions and comments on our mailing list: openstack@griddynamics.com.


by Boris Filippov at April 25, 2012 03:50 PM

Chmouel Boudjnah

Swift integration with other OpenStack components in Essex.

During the development of OpenStack Essex, a lot of work went into making Swift work well with the other OpenStack components. This is a list of what has been done.

MIDDLEWARE

To make Swift behave well in the ‘stack’ we had to get a rock-solid keystone middleware and make sure most of the features provided by Swift would be supported by the middleware.

The middleware is currently located in the keystone essex repository and was entirely rewritten since the Diablo release to support these Swift features:

  • ACL via keystone roles :

This allows you to map keystone roles to ACLs. For example, to allow a user with the keystone role ‘Reader’ to read a container, the user in swift_operator_role can set this ACL:

-r:Reader container

  • Anonymous access via ACL referrer.

If a swift_operator wants to give anonymous read access to a container, they can set this ACL:

-r:*

It basically means you are enabling public access to the container.

  • Container syncing :

This allows you to keep two different containers in sync; see the documentation here.

  • Different reseller prefix :

You will be able to mix different auth servers on your Swift cluster, like swauth and keystone.

  • Special reseller admin account :

This is a special account which is allowed to access all accounts. It is used by nova, for example, to upload images to different accounts.

  • S3 emulation :

Allows you to connect to Swift with the S3 API using swift3 and the new s3_token middleware. The s3_token middleware simply takes an S3 token, validates it in keystone, and passes the proper tenant/user information on to Swift.

One thing missing in the middleware is auth overriding: when another middleware wants to take care of the authentication for some requests, the auth middleware should just let them through and allow the request to continue. Such a feature is used, for example, by the temp_url middleware to allow temporary access/upload to an object. This is planned to be supported in the future.

An important thing to keep in mind when you configure your roles is to have a user in a tenant (or account, as it is called in the Swift world) acting as an operator. This is controlled by the setting:

swift_operator_roles

which by default contains the roles swiftoperator and admin. A user needs to have one of these roles to be able to do anything in a tenant.
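
As a rough illustration of the operator check, here is a small Python sketch. It is not the middleware's code: the function name `is_operator` and the exact-match comparison are assumptions, and only the two default role names come from the text above.

```python
# Default taken from the text above; the exact-match comparison is an assumption.
DEFAULT_OPERATOR_ROLES = ("swiftoperator", "admin")


def is_operator(user_roles, operator_roles=DEFAULT_OPERATOR_ROLES):
    """Return True if the user holds at least one operator role."""
    return bool(set(user_roles) & set(operator_roles))


print(is_operator(["Member", "admin"]))
print(is_operator(["Reader"]))
```

A user who only holds a role such as ‘Reader’ is not an operator and depends on container ACLs set by someone who is.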

GLANCE

Glance has been updated as well to be able to store images in a Swift cluster whose auth server uses the 2.0 identity auth.

NOVA

Nova has the ability to use an objectstore to hold images which have been uploaded with the euca-upload-bundle command. Historically nova has shipped with a service called nova-objectstore, but the service was buggy and had some security issues. Swift combined with keystone’s s3_token and swift3 middleware can now act as a more reliable and secure objectstore for Nova.

DEVSTACK

Devstack supports Swift if you add the swift service to the ENABLED_SERVICES variable in your localrc. This is where you want to poke around to see how the configuration is done to have everything playing well together. The only bit that didn’t make it into the devstack essex release is having glance store images directly in Swift.

CLI / Client Library

The Swift CLI and client library (called swift.common.client) have been updated to support auth v2.0. The CLI now supports the common OpenStack CLI arguments and environment variables to operate against an auth server that has 2.0 identity auth.

Unfortunately we did not have time to add support for OS_AUTH_TENANT, so the CLI uses the Swift auth v1 syntax: if the user has the form tenant:user, tenant is used as OS_AUTH_TENANT and user as OS_AUTH_USER.

Aside from a couple of missing bits, we believe Swift should be rock solid to use with your other OpenStack components. There is no excuse not to use Swift as your central object storage component in OpenStack ;-) .
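
The tenant:user convention can be sketched as follows. This is an illustrative helper, not the client library's code: the name `split_auth_user` and the rule that an explicitly supplied tenant wins over the prefix are assumptions.

```python
def split_auth_user(user, tenant=None):
    """Split a v1-style 'tenant:user' string into (tenant, user).

    If a tenant is passed explicitly, the user string is left untouched.
    """
    if tenant is None and ":" in user:
        # Split only on the first colon so the user part may contain colons.
        tenant, user = user.split(":", 1)
    return tenant, user


print(split_auth_user("demo:alice"))
```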

 


by chmouel at April 25, 2012 03:11 PM

Thierry Carrez

A community maturing

A few days after an intense and fruitful OpenStack Design summit, I just recovered enough from jet lag to deliver my impressions in written form. We put a lot of smart people into rooms to discuss various subjects hastily defined while we were busy releasing Essex… and the magic worked again: open collaboration between developers from competing companies, strong but always polite technical discussions, lots of decisions, teams of developers with common interests forming, duplication of effort avoided…

It’s clear that the format (mostly inherited from Ubuntu’s Developers Summits) works very well in our open innovation project: everybody comes with a plan that is open to modifications and the developers are empowered with decision making.  This makes the design summit sessions very appealing to developers, turning them into advocates of our development model in their companies, removing any barriers to contribution that could be left. Being part of the OpenStack community is just pleasant !

However this edition was a bit different from previous ones. There were a lot of signs that our community is maturing. With OpenStack growing, developers can no longer follow every session and give their opinion on every subject: they have to pick their fights, and trust the other developers to come up with the right design in sessions they can’t attend. So sessions had a lot fewer advice-giving people and a lot more people actually signing up to do work. The topics were much more deployers-oriented and much less about changing to the latest shiny stuff. Even less glamorous sessions like bug triaging, documentation, internationalization or stable branch maintenance saw a lot of participants present and signing up to help.

People realized that OpenStack is here to stay, and that strategic contributions are necessary for it to reach the final stages of its long-term world domination plans. When did that switch happen ? A graph recently published in the community newsletter shows the change happening a few months into Essex:

As you can see, people used to care about fixing bugs in spikes around release times. But starting around November, 2011, we see the bugfixes curve starting to follow the bugreporting curve more closely.

After the Diablo release I advocated for companies to put their money where their mouth is and start contributing strategically to OpenStack. I’m happy to see that it happened during the Essex cycle, and that the awesome Design Summit we just had confirms that trend.


by Thierry Carrez at April 25, 2012 01:25 PM

Florian Haas

Coming to New Zealand!

hastexo is offering Cloud Bootcamp for OpenStack™ in Wellington. Another fine example of the global OpenStack community at work.

read more

by florian at April 25, 2012 05:37 AM

April 24, 2012

Russell Bryant

OpenStack Design Summit and an Eye on Folsom

I just spent a week in San Francisco at the OpenStack design summit and conference. It was quite an amazing week and I’m really looking forward to the Folsom development cycle.  You can find notes from various sessions held at the design summit on the OpenStack wiki.

Essex was the first release that I contributed to.  One thing I did was add Qpid support to both Nova and Glance as an alternative to using RabbitMQ.  Beyond that, I primarily worked on vulnerability management and other bug fixing.  For Folsom, I’m planning on working on some more improvements involving the inter-service messaging layer, also referred to as the rpc API, in Nova.

1) Moving the rpc API to openstack-common

The rpc API in Nova is used for private communication between nova services. As an example, when a tenant requests that a new virtual machine instance be created (either via the EC2 API or the OpenStack compute REST API), the nova-api service sends a message to the nova-scheduler service via the rpc API.  The nova-scheduler service decides where the instance is going to live and then sends a message to that compute node’s nova-compute service via the rpc API.
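
The message flow above can be sketched with in-memory queues standing in for the message broker. This is a toy model, not Nova's rpc module: the `cast` helper, the topic names and the "pick the first host" placement are all assumptions made for illustration.

```python
from collections import defaultdict, deque

# topic -> pending messages; a stand-in for RabbitMQ or Qpid.
queues = defaultdict(deque)


def cast(topic, method, **args):
    """Queue a fire-and-forget message on a topic."""
    queues[topic].append({"method": method, "args": args})


def api_run_instance(instance_id):
    """What nova-api does in this sketch: hand the request to the scheduler."""
    cast("scheduler", "run_instance", instance_id=instance_id)


def scheduler_step(hosts):
    """Take one scheduler message, choose a host, forward to its compute topic."""
    msg = queues["scheduler"].popleft()
    host = hosts[0]  # placeholder placement decision
    cast("compute.%s" % host, msg["method"], **msg["args"])
    return host


api_run_instance("i-1")
print(scheduler_step(["node1", "node2"]))
print(queues["compute.node1"][0])
```

The point of the sketch is that no service calls another directly; each one only reads from and writes to topics.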

The other usage of the rpc API in Nova has been for notifications.  Notifications are asynchronous messages about events that happen within the system.  They can be used for monitoring and billing, among other things.  Strictly speaking, notifications aren’t directly tied to rpc.  A Notifier is an abstraction, of which using rpc is one of the implementations.  Glance also has notifications, including a set of Notifier implementations.  The code was the same at one point but has diverged quite a bit since.

We would like to move the notifiers into openstack-common.  Moving rpc into openstack-common is a prerequisite for that, so I’m going to knock that part out.  I’ve already written a few patches in that direction.  Once the rpc API is in openstack-common, other projects will be able to make use of it.  There was discussion of Quantum using rpc at the design summit, so this will be needed for that, too.  Another benefit is that the Heat project is using a copy of Nova’s rpc API right now, but will be able to migrate over to using the version from openstack-common.

2) Versioning the rpc API interfaces

The existing rpc API is pretty lightweight and seems to work quite well.  One limitation is that there is nothing provided to help with different versions of services talking to each other.  It may work … or it may not.  If it doesn’t, the failure you get could be something obvious, or it could be something really bad and bizarre where an operation fails half-way through, leaving things in a bad state.  I’d like to clean this up.

The end goal with this effort will be to make sure that as you upgrade from Essex to Folsom, any messages originating from an Essex service can and will be correctly processed by a Folsom service.  If that fails, then the failure should be immediate and obvious that a message was rejected due to a version issue.
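
One possible shape for such a check is sketched below. The design had not been settled at the time of writing, so everything here is an assumption: the "major.minor" version format, the rule that a receiver accepts the same major version with an equal or lower minor version, and the names `check_version` and `IncompatibleVersion`.

```python
class IncompatibleVersion(Exception):
    """Raised when a message's version cannot be handled by the receiver."""


def parse_version(text):
    major, minor = text.split(".")
    return int(major), int(minor)


def check_version(supported, requested):
    """Accept a message if the majors match and its minor is not newer than ours."""
    s_major, s_minor = parse_version(supported)
    r_major, r_minor = parse_version(requested)
    if r_major != s_major or r_minor > s_minor:
        raise IncompatibleVersion(
            "message version %s rejected; receiver supports %s" % (requested, supported)
        )
    return True


print(check_version("1.2", "1.1"))
```

Rejecting the message up front gives the immediate and obvious failure described above, instead of an operation that dies half-way through.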

3) Removing database access from nova-compute

This is by far the biggest effort of the 3 described here, and I won’t be tackling this one alone.  I want to help drive it, though.  This discussion came up in the design summit session about enhancements to Nova security.  By removing direct database access from the nova-compute service, we can help reduce the potential impact if a compute node were to be compromised.  There are two main parts to this effort.

The first part is to make more efficient use of messaging by sending full objects through the rpc API instead of IDs. For example, there are many cases where the nova-api service gets an instance object from the database, does its thing, and then just sends the instance ID in the rpc message.  The receiving side then has to pull that same object out of the database.  We have to go through and change all cases like this to include the full object in the message.  In addition to the security benefit, it should be more efficient, as well. This doesn’t sound too complicated, and it isn’t really, but it’s quite a bit of work as there is a lot of code that needs to be changed. There will be some snags to deal with along the way, such as making sure all of the objects can be serialized properly.

Including full objects in messages is only part of the battle here.  The nova-compute service also does database updates.  We will have to come up with a new approach for this.  It will most likely end up being a new service of some sort, that handles state changes coming from compute nodes and makes the necessary database updates according to those state changes. I haven’t fully thought through the solution to this part yet. For example, ensuring that this service maintains proper order of messages without turning into a bottleneck in the overall system will be important.

Onward!

I’m sure I’ll work on other things in Folsom, as well, but those are three that I have on my mind right now.  OpenStack is a great project and I’m excited to be a part of it!


by russellbryant at April 24, 2012 08:33 PM

Anne Gentle

Command line reference with true scrolling

I’m at the OpenStack Design Summit this week, and one of the OpenStack companies here, Piston Computing, created a pen that contains a scroll inside.

When you open the scroll, you can see all the commands available for the “nova” client, which is how you send commands to the OpenStack Compute API at the command line. Clever!

nova-pen

by annegentle at April 24, 2012 12:52 PM

Florian Haas

A look back at my first OpenStack Design Summit & Conference

I've just returned from the OpenStack Folsom Design Summit and Spring 2012 Conference, and am finally getting rid of my jet lag. Here's a summary of what's been a mind-blowing conference experience for me.

read more

by florian at April 24, 2012 09:35 AM

Ryan Lane

Per-project sudo policies using sudo-ldap and puppet

In Wikimedia Labs, we don’t manage authentication and authorization in the normal public cloud way. We don’t assume that an instance creator is managing auth for instances they create. Instead, all of Labs uses a single auth system for all projects and instances and a community manages project membership and auth.

In the original design, being a project member in specific projects would automatically give you root via sudo and being a project member in a global project would give you shell, but not root. We were handling this through puppet configuration. This was a fairly limiting system. Giving fine-grained permissions wasn’t easy. The instances knew which users were members of a project since the projects were also posix groups; however, they didn’t know which users were in the roles of that project, so there was no fine-grained way to handle this.

sudo-ldap to the rescue. With sudo-ldap, we can manage sudo policies in LDAP, and those can be defined on a per-project basis. Let me explain how we’re handling this while also ensuring the original assumed design still applies to old projects.

Handling the sudo policies in LDAP

To make sudo work per-project, we need to make a sudoers OU for each project. Projects are located at ou=projects,dc=wikimedia,dc=org. We have an example project at cn=testproject,ou=projects,dc=wikimedia,dc=org. We can create a new sudoers OU for this project, with a default policy (for backwards compatibility):

dn: ou=sudoers,cn=testproject,ou=projects,dc=wikimedia,dc=org
ou: sudoers
objectclass: organizationalunit
objectclass: top

dn: cn=default,ou=sudoers,cn=testproject,ou=projects,dc=wikimedia,dc=org
cn: default
objectClass: sudorole
objectClass: top
sudoCommand: ALL
sudoHost: ALL
sudoUser: ALL

The above creates a sudoers OU underneath the project’s object and creates a default policy for that project that gives all users the ability to run all commands via sudo.

For every pre-existing specific project, I created an OU and a default policy; for every pre-existing global project I created only the OU, ensuring everything continued working as it did in the original design. Whenever a project is created, the OU and a default policy are now automatically created with it.
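
As an illustration of how those entries could be generated per project, here is a small Python sketch that builds the same LDIF text as the example above. It is not the tooling used in Labs: the function name `sudoers_ldif` and its `with_default_policy` flag are hypothetical, and project names are not escaped, so it is only suitable for simple names.

```python
# Base DN taken from the post; everything else here is illustrative.
PROJECTS_BASE = "ou=projects,dc=wikimedia,dc=org"


def sudoers_ldif(project, with_default_policy=True):
    """Build LDIF text for a project's sudoers OU and, optionally, its default policy."""
    ou_dn = "ou=sudoers,cn=%s,%s" % (project, PROJECTS_BASE)
    entries = [[
        "dn: %s" % ou_dn,
        "ou: sudoers",
        "objectclass: organizationalunit",
        "objectclass: top",
    ]]
    if with_default_policy:
        # Allow-everything policy, kept for backwards compatibility.
        entries.append([
            "dn: cn=default,%s" % ou_dn,
            "cn: default",
            "objectClass: sudorole",
            "objectClass: top",
            "sudoCommand: ALL",
            "sudoHost: ALL",
            "sudoUser: ALL",
        ])
    # LDIF separates entries with a blank line.
    return "\n\n".join("\n".join(entry) for entry in entries) + "\n"


print(sudoers_ldif("testproject"))
```

Calling it with `with_default_policy=False` gives the OU-only form used for the pre-existing global projects.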

Configuring sudo on the instances

Now we must configure the instances to pull their sudo policies from this OU. Here’s the puppet template we’re using for /etc/sudo-ldap.conf:

BASE            <%= basedn %>
URI             <% servernames.each do |servername| -%>ldap://<%= servername %>:389 <% end -%>

BINDDN          cn=proxyagent,ou=profile,<%= basedn %>
BINDPW          <%= proxypass %>
SSL             start_tls
TLS_CHECKPEER   yes
TLS_REQCERT     demand
TLS_CACERTDIR   /etc/ssl/certs
TLS_CACERTFILE  /etc/ssl/certs/<%= ldap_ca %>
TLS_CACERT      /etc/ssl/certs/<%= ldap_ca %>
<% if ldapincludes.include?('sudo') then %>SUDOERS_BASE    <%= sudobasedn %><% end %>

The sudobasedn variable is being set as this:

$sudobasedn = "ou=sudoers,cn=${instanceproject},ou=projects,${basedn}"

For a more in-context view, you can clone our repo, or browse it via gitweb.

Managing the sudo policies

In the trunk version of the OpenStackManager extension, I’ve added support for managing per-project sudo. Users must be a member of the sysadmin role to do so.

by Ryan Lane at April 24, 2012 12:58 AM

April 23, 2012

Zmanda

Next Steps with OpenStack Swift Advisor - Profiling and Optimization (with Load Balancer in the Mix)

In our last blog on building Swift storage clouds, we proposed the framework for the Swift Advisor - a technique that takes two of the three constraints (Capacity, Performance, Cost) as inputs, and provides hardware recommendations as output - specifically count and configuration of systems for each type of node (storage and proxy) of the Swift storage cloud (Swift Cloud). Plus, we also provided a subset of our initial results for the Sampling phase.

In this blog, we will continue the discussion on Swift Advisor, first focusing on the impact of the load balancer on the aggregate throughput of the cloud (we will refer to it as “throughput”) and then provide a subset of outcomes for the profiling and optimization phases in our lab.

Load Balancer

The load balancer distributes the incoming API requests evenly across the proxy servers. As shown below, the load balancer sits in front of the proxy servers to forward the API requests to them and can be connected with any number of proxy servers.

load balancer

If a load balancer is used, it is the only entry point of the Swift Cloud and all user data goes through it. So it is a very important component to consider for the user-visible performance of your Swift Cloud. If it is not properly provisioned, it will become a severe bottleneck that inhibits the scalability of the Swift Cloud.

At a high-level, there are two types of load balancers:

Software Load Balancer: Runs load balancing software (e.g. Pound, Nginx) or round robin DNS on a server to evenly distribute the requests among proxy servers. The server running the software load balancer usually requires powerful multi-core CPUs and extremely high network bandwidth.

Hardware Load Balancer: Leverages the network switch/firewall or dedicated hardware with capability of load balancing to assign the incoming data traffic to the proxy servers of Swift Cloud.

Regardless of whether a software or hardware load balancer is used, the throughput of the Swift cloud cannot scale beyond the bandwidth of the load balancer. Therefore, we advise the cloud builders to deploy a powerful load balancer (e.g. with 10 Gigabit Ethernet) so that its “effective” bandwidth  exceeds the expected throughput of the Swift cloud.  We recommend that you pick your load balancer so that with a fully loaded (i.e. 100% busy) Swift Cloud, the load balancer still has around 50% unused capacity for future planning or sudden needs of higher bandwidth.
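
The headroom recommendation can be turned into a quick calculation. The sketch below is just arithmetic on the rule of thumb above: the function name and the use of MB/s as the unit are assumptions, and 0.5 is the roughly 50% unused capacity mentioned in the text.

```python
def required_lb_bandwidth(peak_throughput_mb_s, headroom=0.5):
    """Bandwidth the load balancer needs so that `headroom` of it stays unused
    when the Swift Cloud runs at its peak throughput."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return peak_throughput_mb_s / (1.0 - headroom)


print(required_lb_bandwidth(175))
```

With 50% headroom the load balancer simply needs twice the expected peak throughput of the cloud behind it.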

To give a sense of how to properly provision the load balancer and how it impacts the throughput of the Swift Cloud, we show some results of running a Swift Cloud of c proxy servers and cN storage servers (c:cN Swift Cloud) with the load balancer. (N is the “magic” value for the 1:N Swift Cloud found in the Sampling phase.) These results are the “performance curves” for the profiling phase and can be used directly for optimizing your goal.

The experiments

In our last article, we already used some running examples to show how to get the output results from the Sampling phase. Here, we directly use the outputs (1:N swift cloud) of sampling phase as the inputs of the profiling phase, as seen below,

  • 1 Large Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 CPU XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 Quad Instance based proxy node: 5 Medium Instance based storage nodes (N=5)

Based on the above 1:5 swift clouds, we profile the throughput curves of c:c5 Swift cloud (c = 2, 4, 6,…) with the following setups of load balancer:

  1. Using one “Cluster Compute Eight Extra Large Instance” (Eight) with Pound (a reverse proxy and load balancer) as the software load balancer (“1 Eight”), to which all proxy nodes are connected. (The Eight Instance is one level more powerful than the Quad Instance. Like the Quad Instance, it is equipped with 10 Gigabit Ethernet, but it has 2X the CPU resources (2 x Intel Xeon E5-2670, eight-core “Sandy Bridge” architecture) and 2X the memory.)
  2. Using two identical Eight Instances (each running Pound) as the load balancers (“2 Eight”). 50% of the proxy nodes are connected to the first Eight Instance and the other 50% to the second Eight Instance. The storage nodes are unaware of this split and accept data from all of the proxy nodes.

Again, we use Amanda Enterprise as our application to back up a 20GB data file to the c:c5 Swift Cloud. We concurrently run two Amanda Enterprise servers on two EC2 Quad instances to send data to the c:c5 Swift cloud, ensuring that the two Amanda Enterprise servers can fully load the c:c5 Swift cloud in all cases.

For this experiment, we focus on the backup operations, so the aggregate throughput of backup operations is simply regarded as “throughput” (MB/s) measured between the two Amanda Enterprise servers and the c:c5 Swift cloud.

Let’s first look at the throughput curves (throughput on Y-axis, values of c on X-axis) of c:c5 Swift cloud with the two types of load balancers for each of above mentioned configurations of proxy and storage nodes.

(1) Proxy nodes run on the Large instance and the storage nodes run on the Small instance. The two curves are for the two types of load balancers (LB):

Proxy nodes run on the Large instance

(2) Proxy nodes run on the XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the XL instance

(3) Proxy nodes run on the CPU XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the CPU XL instance

(4) Proxy nodes run on the Quad instance and the storage nodes run on the Medium instance.

Proxy nodes run on the Quad instance

From the above 4 figures, we can see that the throughput of the c:c5 Swift cloud using 1 Eight instance as the load balancer cannot scale beyond 140MB/s, while with 2 Eight instances as the load balancers, the c:c5 Swift Cloud scales linearly (for the values of “c” we tested with).

Next, we combine the above results of “2 Eight” load balancer  into one picture, and look at it from another point of view –  throughput on Y-axis, cost ($) on X-axis. (As you may recall from our last blog, the cost is defined as the EC2 usage cost of running c:c5 swift cloud for 30 days.)

Throughput versus cost with the “2 Eight” load balancer

The above graph tells us several things:

(1) The configuration of using CPU XL instances for proxy nodes and Small instances for Storage node is not a good choice, because when compared with configuration of using XL instances for proxy nodes and Small instances for Storage node, it consumes similar cost, but delivers lower throughput. The reason for this is our observation that XL instances provide better bandwidth than CPU XL instances. AWS marks the I/O performance (including the network bandwidth) of  both XL instance and CPU XL instance as “High”. From our pure network bandwidth testing, XL instance shows maximum 120 MB/s for both incoming and outgoing bandwidth, while CPU XL instance has maximum 100 MB/s for both incoming and outgoing bandwidth.

(2) The configuration of using Large instances on proxy nodes and Small instances on Storage node is the most cost-effective. Since within each throughput group (marked as dotted circle in the figure): low, medium and high, it achieves the similar throughput, but with much lesser cost. The reason  this configuration can be cost-effective is because Large instance can provide the maximum 100 MB/s for both incoming and outgoing network bandwidth, which is similar to the XL and CPU XL instances, but is associated with 2x lower cost than the XL and CPU XL instances.

(3) While using Large instances for proxy nodes and Small instances for storage nodes is very cost-effective, using Quad instances for proxy nodes and Medium instances for storage nodes is also an attractive option, especially if you consider manageability and failure issues. To achieve 175 MB/s throughput, you can choose either 8 Large instance based proxy nodes and 40 Small instance based storage nodes (48 nodes in total), or 4 Quad instance based proxy nodes and 20 Medium instance based storage nodes (24 nodes in total). Hosting and managing more nodes in the data center may incur higher IT-related costs, e.g. power, number of server racks, failure rate and IT administration. Considering those costs, it may be more attractive to set up a Swift Cloud with a smaller number of more powerful nodes.

Based on the data in the above figure and considering the IT-related costs, the goal of the optimization phase is to choose the configuration that best meets your objective. For example, suppose you input the performance and capacity constraints and want to minimize the cost, and suppose two configurations can both satisfy your capacity constraint: (1) using Large instances for proxy nodes and Small instances for storage nodes, and (2) using Quad instances for proxy nodes and Medium instances for storage nodes. The only thing left is to figure out which configuration costs less while fulfilling the throughput constraint. The final result depends on your IT management costs. If your IT management cost is relatively high, then you may want to choose the second configuration; otherwise, the first configuration will likely cost less.
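The trade-off between the two configurations can be sketched numerically. In the rough Python sketch below, the node counts (48 and 24) come from the 175 MB/s example above, while the monthly EC2 costs and the per-node management cost are hypothetical placeholders, not measured values.

```python
# Compare two configurations that both reach the target throughput.
# Node counts come from the 175 MB/s example; all dollar figures are hypothetical.

def total_monthly_cost(ec2_cost, node_count, mgmt_cost_per_node):
    """EC2 usage cost plus a flat IT management cost per node."""
    return ec2_cost + node_count * mgmt_cost_per_node


def cheaper_config(configs, mgmt_cost_per_node):
    """Return the name of the configuration with the lowest total cost."""
    return min(
        configs,
        key=lambda name: total_monthly_cost(
            configs[name]["ec2_cost"], configs[name]["nodes"], mgmt_cost_per_node
        ),
    )


configs = {
    "large_small": {"ec2_cost": 4000.0, "nodes": 48},  # 8 Large + 40 Small
    "quad_medium": {"ec2_cost": 5000.0, "nodes": 24},  # 4 Quad + 20 Medium
}

low_mgmt_choice = cheaper_config(configs, mgmt_cost_per_node=10.0)
high_mgmt_choice = cheaper_config(configs, mgmt_cost_per_node=100.0)
print(low_mgmt_choice, high_mgmt_choice)
```

With these placeholder numbers, the many-small-nodes configuration wins when management is cheap, and the fewer-larger-nodes configuration wins once the per-node management cost grows.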

In future articles, we will talk about how to map EC2 instances to physical hardware so that cloud builders can build an optimized Swift cloud running on physical servers.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at  swift@zmanda.com

by ning at April 23, 2012 07:00 AM

April 22, 2012

John Dickinson

Swift Tech Overview

OpenStack Object Storage, called swift, is a distributed, fault-tolerant, eventually consistent object storage system. In this post, I’d like to go into some detail about what that means.

Distributed

Swift is a distributed system. It is designed to be run on a cluster of computers rather than on a single machine. Swift is composed of three major parts: the proxy, storage servers, and consistency servers.

Proxy

The proxy server is a server process that provides the swift API. As the only system in the swift cluster that communicates with clients, the proxy is responsible for coordinating with the storage servers and replying to the client with appropriate messages. The proxy is an HTTP server that implements swift’s REST-ful API. All messages to and from the proxy use standard HTTP verbs and response codes. This allows developers building clients to interact with swift in a simple, familiar way.
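As a small illustration of what "standard HTTP verbs" means for a client, the sketch below builds (but does not send) an object upload request using only the Python standard library. The endpoint, account name and token are placeholder assumptions; the `/v1/<account>/<container>/<object>` path layout and the `X-Auth-Token` header follow the swift API guide.

```python
from urllib.request import Request

# Placeholder values: endpoint, account and token are assumptions for illustration.
ENDPOINT = "https://swift.example.com/v1/AUTH_demo"
TOKEN = "example-token"


def build_object_put(container, name, data):
    """Build (but do not send) an HTTP PUT that would upload an object."""
    url = f"{ENDPOINT}/{container}/{name}"
    return Request(url, data=data, method="PUT", headers={"X-Auth-Token": TOKEN})


req = build_object_put("photos", "cat.jpg", b"...image bytes...")
print(req.get_method(), req.full_url)
```

Reading the object back would be a GET on the same URL, and removing it a DELETE.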

Swift provides data durability by writing multiple complete replicas of the data stored in the system. The proxy is what coordinates the read and write requests from clients and implements the read and write guarantees of the system. When a client sends a write request, the proxy ensures that the object has been successfully written to disk on the storage nodes before responding with a code indicating success.

Storage Servers

The swift storage servers provide the on-disk storage for the cluster. There are three types of storage servers in swift: account, container, and object. Each of these servers provides an internal REST-ful API. The account and container servers provide namespace partitioning and listing functionality. They are implemented as sqlite databases on disk, and like all entities in swift, they are replicated to multiple availability zones within the swift cluster.

Swift is designed for multi-tenancy. Users are generally given access to a single swift account within a cluster, and they have complete control over that unique namespace. The account server implements this functionality. Users can set metadata on their account, and swift aggregates usage information here. Additionally, the account server provides a listing of the containers within an account.

Swift users may segment their namespace into individual containers. Although containers cannot be nested, they are conceptually similar to directories or folders in a file system. Like accounts, users may set metadata on individual containers, and containers provide a listing of each object they contain. There is no limit to the number of containers that a user may create within a swift account, and the containers do not have globally-unique naming requirements.

Object servers provide the on-disk storage for objects stored within swift. Each object in swift is stored as a single file on disk, and object metadata is stored in the file’s extended attributes. This simple design allows the object’s data and metadata to be stored together and replicated as a single unit.

Consistency Servers

Storing data on disk and providing a REST-ful API to it is not a hard problem to solve. The hard part is handling failures. Swift’s consistency servers are responsible for finding and correcting errors caused by both data corruption and hardware failures.

Auditors run in the background on every swift server and continually scan the disks to ensure that the data stored on disk has not suffered any bit-rot or file system corruption. If an error is found, the corrupted object is moved to a quarantine area, and replication is responsible for replacing the data with a known good copy.

Updaters ensure that account and container listings are correct. The object updater is responsible for keeping the object listings in the containers correct, and the container updaters are responsible for keeping the account listings up-to-date. Additionally, the object updater updates the object count and bytes used in the container metadata, and the container updater updates the object count, container count, and bytes used in the account metadata.

Replicators ensure that the data stored in the cluster is where it should be and that enough copies of the data exist in the system. Generally, the replicators are responsible for repairing any corruption or degraded durability in the cluster.

Fault-tolerant

The combination of swift’s pieces allows a swift cluster to be highly fault-tolerant. Swift implements the concept of availability zones within a single geographic region, and data can be written to hand-off nodes if primary nodes are not available. This allows swift to survive hardware failures up to and including the loss of an entire availability zone with no impact to the end-user.

An interesting consequence of this design is that upgrades and cluster resizes can be easily performed on a production cluster with zero end-user downtime. Swift provides both forward and backwards compatibility of its API, so a swift cluster can be running multiple versions of the swift software at the same time, as is common while the software is being upgraded. Similarly, during resizes, the incongruent data about where data lives is simply seen as a failure. Processes like replication ensure that the data will be moved to its correct location.

Eventually Consistent

Swift achieves high scalability by relaxing constraints on consistency. While swift provides read-your-writes consistency for new objects, listings and aggregate metadata (like usage information) may not be immediately accurate. Similarly, reading an object that has been overwritten with new data may return an older version of the object data. However, swift provides the ability for the client to request the most up-to-date version at the cost of request latency.

Example Request Flow

When an object PUT request is made to swift, the proxy server determines the correct storage nodes responsible for the data (based on a hash of the object name) and sends the object data to those object servers concurrently. If one of the primary storage nodes is unavailable, the proxy will choose an appropriate hand-off node to write data to. If a majority of the object servers respond with a success, then the proxy returns success to the client.
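The write path can be sketched in a few lines of Python. This is a toy model, not swift's actual ring or proxy code: node placement is a simple hash-and-wrap over a list, writes are issued one after another rather than concurrently, and `write` is a hypothetical callback that returns True on success.

```python
import hashlib


def pick_nodes(object_path, nodes, replicas=3):
    """Pick `replicas` primary nodes from a hash of the object path (toy placement)."""
    digest = int(hashlib.md5(object_path.encode("utf-8")).hexdigest(), 16)
    start = digest % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


def put_object(object_path, data, nodes, write, replicas=3):
    """Write to each primary, fall back to hand-off nodes, succeed on a majority."""
    primaries = pick_nodes(object_path, nodes, replicas)
    handoffs = [n for n in nodes if n not in primaries]
    successes = 0
    for node in primaries:
        if write(node, object_path, data):
            successes += 1
            continue
        # Primary failed: try hand-off nodes until one accepts the write.
        while handoffs:
            if write(handoffs.pop(0), object_path, data):
                successes += 1
                break
    return successes > replicas // 2


# Usage with an in-memory stand-in for the object servers; node2 is "down".
down = {"node2"}
stored = {}


def write(node, path, data):
    if node in down:
        return False
    stored.setdefault(node, {})[path] = data
    return True


nodes = [f"node{i}" for i in range(5)]
ok = put_object("/acct/cont/obj", b"hello", nodes, write)
print(ok, sorted(stored))
```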

Similarly, when an object GET request is made, the proxy determines which three storage nodes have the data and then requests the data from each node in turn. The proxy will return the object data from the first storage node to respond successfully.
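The read path is simpler and can be sketched the same way. Here `read` is a hypothetical callback that returns the object bytes, or None when a node is unavailable or does not have the object; the real proxy works over HTTP.

```python
def get_object(object_path, replica_nodes, read):
    """Ask each replica node in turn; return data from the first that succeeds."""
    for node in replica_nodes:
        data = read(node, object_path)
        if data is not None:
            return data
    return None  # no replica could serve the object


# Usage: the first node is unavailable, the second one answers.
cluster = {
    "node0": None,  # simulates a node that is down
    "node1": {"/acct/cont/obj": b"hello"},
    "node2": {"/acct/cont/obj": b"hello"},
}


def read(node, path):
    contents = cluster[node]
    return None if contents is None else contents.get(path)


result = get_object("/acct/cont/obj", ["node0", "node1", "node2"], read)
print(result)
```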

Client Data Designs

Using any storage system effectively means understanding the characteristics of the system and the guarantees that the system provides. Swift is optimized for high concurrency rather than single-stream throughput. The aggregate throughput of a swift cluster is much higher than what is available for a single request stream. A swift client can take advantage of this by distributing data across multiple containers within an account. For example, backups may be stored by day or week in a container that includes that information in its name. Or a photo-sharing application may store images across many containers by using a prefix of the hash of the photo in the container names.
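The two naming schemes mentioned above might look like the following sketch. The container name prefixes (`photos_`, `backups_`), the choice of SHA-1 and the two-character hash prefix are illustrative assumptions; any stable hash and naming convention would serve the same purpose.

```python
import hashlib
from datetime import date


def photo_container(photo_bytes, prefix_len=2):
    """Spread photos over containers named after a prefix of the content hash."""
    digest = hashlib.sha1(photo_bytes).hexdigest()
    return f"photos_{digest[:prefix_len]}"  # 2 hex chars -> up to 256 containers


def backup_container(day):
    """Store each day's backups in a container that carries the date in its name."""
    return f"backups_{day.isoformat()}"


print(photo_container(b"raw image data"))
print(backup_container(date(2012, 4, 22)))
```

Because the container is derived from the photo itself, a client can recompute where an image lives without keeping a separate index.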

Summary

Swift’s design provides robust software that can run effectively on unreliable (read: cheap) hardware. Modular processes allow deployers to optimize clusters based on client use cases. Fault-tolerance allows clusters to be effectively managed by a limited operations staff.

Swift is production-ready code that has been running at scale powering Rackspace Cloud Files for two years. It is being deployed around the world at large and small scale by public cloud service providers and for private, internal needs. Swift is 100% open source released under the Apache 2.0 license. For more information, you can read the technical docs, the admin guide, or the API guide. To get started building applications for swift, you can use either the stand-alone Python module included in swift’s code or any of Rackspace’s Cloud Files language bindings. If you have further questions, ask on the Openstack mailing list or in #openstack on freenode.

April 22, 2012 12:00 AM

April 21, 2012

OpenStack Blog

OpenStack Conference Spring 2012 Day 2

And today marks the end of this full week of meetings, panels, formal and informal chats, parties, networking and lots of fun. The final count of registered people went well over 1,000! This calls for a celebration to this great community.


Remember to publish your slides on Slideshare OpenStack Group and stay tuned for the videos of the conference. Thank you all for participating, see you all soon.


by Stefano Maffulli at April 21, 2012 12:13 AM

April 20, 2012

OpenStack Blog

OpenStack Conference Spring 2012 Day 1

The OpenStack Conference started with incredible keynotes this morning. All sessions are recorded and videos will be published soon. Meanwhile, the slides provided by the speakers are shared on SlideShare OpenStack group.

Highlights of first day

packed room for opening session at #openstack conference with @jbryce

Radio Free Asia brings freedom of press to closed societies using #OpenStack

Mark Interrante and John Engates live demoing Rackspace Cloud Servers powered by #OpenStack

Live deployment of #OpenStack by Mark Shuttleworth on stage

Kurt Garloff, VP Engineering at DBU Cloud Services in Deutsche Telekom talking about his Linux experience and parallels

Biri Singh explains HP Cloud Services powered by #OpenStack

Chris Kemp on stage thanking leaders of #OpenStack and recognizing ecosystem

“Vish Ashaya hosting a panel of block storage experts at #openstack including #Ceph's Tommi Vaartinen”

Party time now!

by Stefano Maffulli at April 20, 2012 12:30 AM

April 18, 2012

OpenStack Blog

Folsom Design Summit Day 3

The Folsom Design Summit has ended. Tonight’s party marks the beginning of the OpenStack Spring 2012 Conference.

Highlights of the day

See what attendees think of the OpenStack Design Summit

<iframe frameborder="0" height="315" src="http://www.youtube.com/embed/NONFTut7Y6A" width="560"></iframe>

chaos monkey strolled through the #OpenStack dev lounge...

by Stefano Maffulli at April 18, 2012 11:59 PM

Zmanda

OpenStack Swift Advisor: Building Cloud Storage with Optimized Capacity, Cost and Performance

OpenStack Swift is an open source cloud storage platform, which can be used to build massively scalable and highly robust storage clouds. There are two key use cases of Swift:

  • A service provider offering cloud storage with a well defined RESTful HTTP API - i.e. a Public Storage Cloud. An ecosystem of applications integrated with that API are offered to the service provider’s customers. Service provider may also choose to only offer a select service (e.g. Cloud Backup) and not offer access to the API directly.
  • A large enterprise building a cloud storage platform for use for internal applications - i.e. a Private Storage Cloud. The organization may do this because it is reluctant to send its data to a third party public cloud provider or to build a cloud storage platform which is closer to the users of its applications.

In both of the above cases, as you plan to build your cloud storage infrastructure, you will face one of these three problems:

  1. Optimize my cost: You know how much usable storage capacity you need from your cloud storage, and you know how much aggregate throughput you need for applications using the cloud storage, but you want to know what is the least amount of budget you need to be able to achieve your capacity and throughput goals.
  2. Optimize my capacity: You know how much aggregate throughput you need for applications using the cloud storage, and you know your budget constraints, but you want to know the maximum capacity you can get for your throughput needs and budget constraints.
  3. Optimize my performance: You know how much usable storage capacity you need from your cloud storage, and you know your budget constraints, but you need to know the configuration to get best aggregate throughput for your capacity and budget constraints.

Solving any of the three problems above is very complex because of the myriad choices that the cloud storage builder has to make, e.g. size and number of various types of servers, network connectivity, SLAs etc. We have done extensive work in our labs and with several cloud providers to understand the above problems and to address them with rigorous analysis. In this series of blogs we will share some of our findings, as well as descriptions of tools and services which can help you build, deploy and maintain your storage cloud with confidence.

Definitions

Since the terms used can be interpreted differently depending on context, below are the specific definitions used in this series of blogs for the three key parameters:

Capacity: It is the usable storage capacity, i.e. the maximum size of application data that can be stored on the cloud storage. Usually, for better availability and durability, the data is replicated in the cloud storage across multiple systems. So, the raw capacity of the cloud storage should be planned with data redundancy in mind. For example, in OpenStack Swift, each object is replicated three times by default. So, the total size of raw storage will be at least three times larger than the usable storage capacity.
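The capacity arithmetic can be made concrete with a short sketch. The replica count of three is the Swift default mentioned above; the 100 TB target and the 12 TB of raw disk per storage node are hypothetical numbers chosen only to show the calculation.

```python
import math


def raw_capacity_tb(usable_tb, replicas=3):
    """Raw storage needed to hold `usable_tb` of data with full replication."""
    return usable_tb * replicas


def storage_nodes_needed(usable_tb, raw_tb_per_node, replicas=3):
    """Minimum number of storage nodes, ignoring file system and other overhead."""
    return math.ceil(raw_capacity_tb(usable_tb, replicas) / raw_tb_per_node)


print(raw_capacity_tb(100))           # raw TB for 100 TB usable
print(storage_nodes_needed(100, 12))  # nodes with 12 TB raw disk each
```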

Performance: It is the maximum aggregate throughput (MB/s or GB/s) that can be achieved by applications from the cloud storage. In this blog, we will also use the term throughput to denote aggregate throughput.

Cost: For this discussion we will only consider the initial purchase cost of the hardware for building the cloud storage. We expect that the built cloud storage will be put to use for several years, but we are not amortizing the cost over a period of time.  We will point out best practices to reduce on-going maintenance and scaling costs. For this series of blogs we will use the terms “node” and “server” interchangeably. So, “storage node” is same as “storage server”.

Introducing the framework for the Swift Advisor

The Swift Advisor is a technique that takes two of the three constraints (Capacity, Performance, Cost) as inputs and provides a hardware recommendation as output, specifically the count and configuration of systems for each type of node (storage and proxy) of the Swift storage cloud. This recommendation is optimized for the third constraint: e.g. minimize your budget, maximize your throughput, or maximize your usable storage capacity.

Before discussing the technical details of the Swift Advisor, let’s first look at a practical way to use it. To build an optimized Swift cloud storage (Swift Cloud), the Swift Advisor needs to consider a very large range of hardware configurations (e.g. a wide variety of CPU, memory, disk and network choices). However, it is unrealistic and very expensive to blindly purchase a large amount of physical hardware upfront and let Swift Advisor evaluate the individual performance of each piece as well as the overall performance after putting them together. Therefore, we choose to leverage the virtualized and elastic environment offered by Amazon EC2 and initially build an optimized Swift Cloud on EC2 instances.

While it may seem ironic that we are using a public compute cloud to come up with an optimized private storage cloud, the reasons for choosing EC2 as the test-bed for Swift Advisor are multi-fold:

(1) EC2 provides many types of instances with different capacities of CPU, memory and I/O to meet various needs. So, the Swift Advisor can try out many types of EC2 instances on a pay-per-use basis, instead of physically owning the wide variety of hardware needed.

(2) EC2 has a well defined pricing structure. This provides a good comparison point for cloud storage builders: they can look at the pricing information and justify the cost of owning their own cloud storage in the long run.

(3) The specification of each type of EC2 instance, including CPU, memory, disk and network, is well defined. Once an optimized Swift Cloud is built on EC2 instances with the input constraints, the specifications of those EC2 instances can effectively guide the purchase of physical servers to build a Swift Cloud running on physical hardware.

In summary, you can use the elasticity of a compute cloud along with Swift Advisor to get specifications for your physical hardware based storage cloud, while preserving your desired constraints.

The high-level workflow of the Swift Advisor is shown below:

[Figure: The high-level workflow of the Swift Advisor]

There are four important phases, which we explain as follows:

Sampling Phase: Our eventual goal is to build an optimized Swift cloud consisting of quantity A of proxy servers and quantity B of storage servers. A and B are unknown initially, and we denote this as an A:B Swift Cloud. In this first phase we focus on the performance and cost characteristics of a 1:N Swift Cloud. We look for the “magic” value of N that gives a 1:N Swift Cloud the lowest cost per throughput ($ per MB/s). The reason we want to find the 1:N Swift cloud with the lowest $ per MB/s is to avoid two potential pitfalls when building a Swift cloud: (1) Under-provisioning: the proxy server is underutilized and could still be attached to more storage servers to improve the throughput. (2) Over-provisioning: the proxy server has been overwhelmed by too many storage servers.

Since the combinatorial space of storage and proxy node choices is potentially huge, we use several heuristics to prune the candidates during various phases of the Swift Advisor. For example, we do not consider very low-powered configurations (e.g. Micro Instances) for proxy nodes.

After the sampling phase, for each combination of EC2 instance sizes on proxy and storage servers, we know the “magic” value of N that produces the lowest $ per MB/s of running a 1:N Swift cloud. You can run the sampling phase on any available virtual or physical hardware, but the larger the sample set the better.
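Picking the “magic” N from a set of samples amounts to minimizing cost divided by throughput. The sketch below assumes the samples have already been measured; the throughput values and the 30-day instance costs are hypothetical placeholders, and `best_n` is a name introduced here for illustration.

```python
def best_n(samples, proxy_cost, storage_cost):
    """Return the N with the lowest $ per MB/s for a 1:N Swift cloud.

    samples: {N: measured throughput in MB/s}
    proxy_cost / storage_cost: 30-day cost of one proxy / one storage instance.
    """
    def dollars_per_mbps(n):
        return (proxy_cost + n * storage_cost) / samples[n]

    return min(samples, key=dollars_per_mbps)


# Hypothetical measurements: throughput flattens out as the proxy saturates.
samples = {5: 40.0, 10: 70.0, 15: 90.0, 20: 95.0}
magic_n = best_n(samples, proxy_cost=1000.0, storage_cost=60.0)
print(magic_n)
```

Note that the cheapest point is not necessarily where throughput peaks: past the knee of the curve, each extra storage node adds cost faster than it adds throughput.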

Profiling Phase: Given the “magic” values of N from the sampling phase, our goal in this phase is to profile throughput curves (the throughput versus the size of the Swift cloud) of several Swift clouds consisting of c proxy servers and cN storage servers (c:cN Swift Cloud) with various values of c.

Please note that each throughput curve corresponds to one combination of hardware configurations (EC2 instance sizes in our case) of the proxy and storage servers. In our experiments, for each combination of EC2 instance sizes of the proxy and storage servers, the profiling starts from a 2:2N Swift Cloud and we double the number of proxy and storage servers each time (e.g. 4:4N, 8:8N, ...). All cN EC2 instances for storage nodes are identical.

The profiling stops when the throughput of c:cN Swift Cloud is larger than the throughput constraint. After that, we apply a non-linear or linear regression on the profiled throughputs to plot a throughput curve with the X-values of c and Y-values of the throughput. The output of the profiling phase is a set of throughput curves of c:cN Swift Cloud, where each curve corresponds to a combination of EC2 instance sizes of the proxy and storage servers.
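For the linear case, the curve fitting can be sketched with an ordinary least-squares line through the profiled points. The data points below are hypothetical, and `fit_line` and `proxies_needed` are names introduced here for illustration; a non-linear fit would need a different model.

```python
import math


def fit_line(cs, throughputs):
    """Least-squares fit of throughput = slope * c + intercept."""
    n = len(cs)
    mean_c = sum(cs) / n
    mean_t = sum(throughputs) / n
    sxx = sum((c - mean_c) ** 2 for c in cs)
    sxy = sum((c - mean_c) * (t - mean_t) for c, t in zip(cs, throughputs))
    slope = sxy / sxx
    return slope, mean_t - slope * mean_c


def proxies_needed(slope, intercept, target_mbps):
    """Smallest c whose predicted throughput reaches the target (slope > 0)."""
    return math.ceil((target_mbps - intercept) / slope)


# Hypothetical profile of a c:cN cloud at c = 2, 4, 8.
slope, intercept = fit_line([2, 4, 8], [44.0, 88.0, 176.0])
c_for_target = proxies_needed(slope, intercept, target_mbps=175.0)
print(round(slope, 2), c_for_target)
```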

Optimization Phase: By taking the throughput curves from the profiling phase and two input constraints, the optimization phase is where we figure out a Swift Cloud optimized for the third parameter. We do this by plotting the constraints on each throughput curve and looking for the optimized value across all curves.

For example, let’s say we are trying to optimize capacity given a maximum budget and a minimum throughput requirement: we will input the minimum required throughput on each throughput curve and find the corresponding values of c, and then reject the throughput curves where the implied hardware cost is more than the budget. Out of the remaining curves we will select the one resulting in maximum capacity, calculated as cN multiplied by the storage capacity of the system used for the storage server.
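The capacity-optimization example can be sketched as a filter followed by a maximum. Each curve is reduced here to a straight line through the origin (`mbps_per_c`), which is a simplifying assumption, and every number in `curves` is a hypothetical placeholder rather than a measured result.

```python
import math


def optimize_capacity(curves, min_throughput, budget):
    """Pick the curve with maximum capacity that meets throughput within budget."""
    best = None
    for name, cfg in curves.items():
        # Smallest c on this (linear) curve that reaches the required throughput.
        c = math.ceil(min_throughput / cfg["mbps_per_c"])
        storage_nodes = c * cfg["n"]
        cost = c * cfg["proxy_cost"] + storage_nodes * cfg["storage_cost"]
        if cost > budget:
            continue  # reject curves whose implied hardware cost exceeds the budget
        capacity_gb = storage_nodes * cfg["disk_gb"]
        if best is None or capacity_gb > best[1]:
            best = (name, capacity_gb, c)
    return best


curves = {
    "large_small": {"mbps_per_c": 22.0, "n": 5, "proxy_cost": 250,
                    "storage_cost": 60, "disk_gb": 160},
    "quad_medium": {"mbps_per_c": 44.0, "n": 5, "proxy_cost": 1200,
                    "storage_cost": 120, "disk_gb": 400},
}

roomy = optimize_capacity(curves, min_throughput=175.0, budget=8000)
tight = optimize_capacity(curves, min_throughput=175.0, budget=5000)
print(roomy, tight)
```

Optimizing for cost or throughput instead would keep the same loop and only change which constraint filters and which quantity is compared.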

Validation and Refinement Phase: The validation phase checks if the optimized Swift cloud really conforms to the throughput constraint through a test run of the workloads. If the test run fails a constraint, then the Swift Advisor goes to the refinement phase. The refinement phase gets the average throughput measured from the test run and sends it to the profiling phase.

The profiling phase adds that information to the profiled data to refine the throughput curves. After that, we use the refined throughput curves as the inputs to redo the optimization phase.

The above four phases constitute the core of Swift Advisor. However, there are some important remaining issues to be discussed:

(1) choice of the load balancer

(2) mapping between the EC2 instance and the physical hardware when the cloud operators finally want to move the optimized Swift Cloud to physical servers, while preserving the three constraints on the new hosting hardware.

(3) SLA constraints.

We will address these and other issues in building an optimized storage cloud for your needs in our future blogs.

Some Sampling Observations

In this blog, we present some of the results based on running Sampling phase on a selected configuration of systems. In future blogs, we will post the results for Profiling phase and Optimization phase.

For our sampling phase, we assume the following potential servers are available to us for the proxy node: EC2 Large (Large), EC2 Extra Large (XL), EC2 Extra Large CPU-high (CPU XL) and EC2 Quadruple Extra Large (Quad). The candidates for the storage node are: EC2 Micro (Micro), EC2 Small (Small) and EC2 Medium (Medium).

Therefore, the total number of combinations of proxy and storage nodes is 4 * 3 = 12, and we need to find the “magic” value of N that produces the lowest $ per MB/s of running a 1:N Swift cloud for each combination. We start the sampling for each combination from N=5 (a production Swift Cloud implementation requires at least 5 storage nodes) and increase it until the throughput of the 1:N Swift Cloud stops increasing. This happens when the proxy node is fully loaded and adding more storage nodes cannot improve the throughput anymore.
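The sampling procedure can be sketched as a loop over the 12 combinations, each of which grows N until throughput plateaus. In this sketch `measure` is a hypothetical callback standing in for a real benchmark run, and the step size and the 2% plateau tolerance are assumptions, not values from our experiments.

```python
from itertools import product

PROXY_SIZES = ["Large", "XL", "CPU XL", "Quad"]
STORAGE_SIZES = ["Micro", "Small", "Medium"]
combinations = list(product(PROXY_SIZES, STORAGE_SIZES))  # 4 * 3 = 12 pairs


def sample_until_plateau(measure, start_n=5, step=5, tolerance=0.02):
    """Increase N until throughput stops improving by more than `tolerance`."""
    samples = {}
    best = 0.0
    n = start_n
    while True:
        throughput = measure(n)
        samples[n] = throughput
        if throughput <= best * (1 + tolerance):
            return samples  # the proxy is fully loaded; more nodes do not help
        best = throughput
        n += step


# Stand-in benchmark: 8 MB/s per storage node until the proxy caps at 100 MB/s.
samples = sample_until_plateau(lambda n: min(8.0 * n, 100.0))
print(len(combinations), samples)
```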

We use Amanda Enterprise as our application to back up a 10 GB data file to the 1:N Swift cloud. Amanda Enterprise runs on an EC2 Quad instance to ensure that one Amanda Enterprise server can fully load the 1:N Swift cloud in all cases. For this analysis we are assuming that the cloud builder is building the cloud storage optimized for backup operations. The user of the Swift Advisor should change the test workload based on the desired mix of expected application workload when the cloud storage goes into production. We first look at the throughput for different values of N on each combination of EC2 instance sizes on proxy and storage nodes.

(1) Proxy node runs on EC2 Large instance and the three curves are for the three different sizes for the storage node:


Observations with EC2 Large Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(2) Proxy node runs on EC2 XL instance:

Observations with EC2 XL Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(3) Proxy node runs on EC2 CPU XL instance:

Observations with EC2 CPU XL Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(4) Proxy node runs on EC2 Quad instance:

Observations with EC2 Quad Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 60
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 20
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 10

Looking at the above graphs, we can already draw some conclusions. For example, if the only storage nodes available to you were equivalent to EC2 Micro instances and you wanted your storage cloud to be able to scale beyond 30 storage nodes (per proxy node), you should pick a proxy node at least equivalent to an EC2 Quad instance.

Let’s look at figures (1) - (4) from another view: fix the EC2 instance size of the storage node and vary the EC2 instance size of the proxy node.

(5) Storage node runs on EC2 Micro instance and the four curves are for the four different EC2 instance sizes on proxy node:

Observations with EC2 Micro Instance based Storage Node:

  1. Large Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  2. XL Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  3. CPU XL Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  4. Quad Instance based Proxy nodes: Throughput stops increasing at # storage node = 60

From the above graphs, we can conclude that: (a) When the proxy node runs on the Quad instance, it has the capability, especially the network bandwidth, to accommodate more storage nodes and achieve higher throughput (MB/s) than the other instances used for the proxy node. (b) Different EC2 instance sizes on the storage node load the same proxy node at different rates: for example, when the proxy node runs on the Quad instance, we need 60 Micro instances as storage nodes to fully load the proxy node, whereas with Small or Medium instances we need only 20 or 10 storage nodes, respectively.

Based on the above throughput results, we now look at the $ per throughput (MB/s) for different values of N on each combination of EC2 instance sizes on proxy and storage nodes. Here, $ is defined as the EC2 usage cost of running a 1:N Swift cloud for 30 days. In this blog we are only showing numbers with the proxy node set to an EC2 Quad instance. We will publish numbers for other combinations in another detailed report.

(6) Proxy node runs on EC2 Quad instance:

Observations with EC2 Quad Instance based Proxy Node:

  1. Micro Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 60
  2. Small Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 15
  3. Medium Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 5

Overall, the lowest $ per MB/s in the above figure is achieved by using Medium Instance based Storage nodes at # storage node = 5. This specific result provides input to the profiling phase: N = 5, 15 and 60 for the proxy/storage node combinations EC2 Quad/Medium, EC2 Quad/Small and EC2 Quad/Micro respectively.

So, one can conclude that when using 1 Quad Instance based Proxy node it may be better to use 5 Medium Instance based Storage nodes to achieve the lowest $ per MB/s, rather than using more Micro Instance based storage nodes.

The above graphs are a small subset of the overall performance numbers achieved during the Sampling phase. The overall objective here is to give you a summary of our recommended approach to building an optimized Swift Cloud. As mentioned above, we will be publishing detailed results in another report, as well as more conclusions and best practices in future blogs in this series.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at  swift@zmanda.com

by ck at April 18, 2012 05:49 PM

OpenStack Blog

Folsom Design Summit Day 2

Highlights of the day:

and an announcement: tomorrow’s Cloud Foundry BOSH + Openstack hackathon; 6pm SF Hyatt bit.ly/HVYqFq (food+beer+code included)

Nicira's Dan Wendlandt introductory talk on #OpenStack Quantum is packed!

Darrell giving his talk on statsd + Swift at the #OpenStack Design Summit.

and to finish this great second day

"little" #openstack HP event at the Ferry Building

by Stefano Maffulli at April 18, 2012 03:00 AM

April 17, 2012

OpenStack Blog

Folsom Design Summit Day 1

Over 400 participants and more than 50 sessions today (over 150 in total). Not just developers in the rooms: I’ve seen lots of users involved in the sessions, asking questions and giving suggestions. Many discussions revolve around real-life problems and provide extremely concrete solutions. There is the feeling that the OpenStack community is maturing.

Full house for the kickoff

Vish leading the Nova volume session

For some of the participants it was a long car trip

But the view is like a commercial ... Priceless

by Stefano Maffulli at April 17, 2012 12:21 AM

April 16, 2012

OpenStack Blog

Live Streaming From OpenStack Folsom Design Summit

Thanks to hardware contributed by Zareason and service provided by Cisco Webex, we’ll be able to offer an experimental live streaming of the discussions at Folsom Design Summit, starting April 16.

If you want to participate in the discussions remotely, check the schedule to see which room is hosting the session you’re interested in, then join the Webex meeting to listen to the discussion and use IRC to ask questions.

Each room has its own dedicated Webex meeting (the password for all events is folsom) and its own dedicated IRC channel:

Please keep in mind that this is an experimental feature and it may break at any time. Follow @openstack on Twitter for updated information. Thanks to all the volunteers who helped on Sunday afternoon: Matt, Duncan and Nirmal.

by Stefano Maffulli at April 16, 2012 05:53 PM

April 13, 2012

OpenStack Blog

Community Weekly Review (Apr 6 – 13)

OpenStack Community Newsletter – April 13, 2012

HIGHLIGHTS

IMPORTANT LINKS FOR THE SUMMIT AND CONFERENCE

OTHER NEWS

COMMUNITY STATISTICS

This week’s statistics are contributed by Bitergia, a startup company expert in analysing open source communities. They developed for us a tool to analyse the history of OpenStack bug reports hosted on Launchpad. The charts below represent data taken from the projects: nova, glance, swift, horizon, keystone, manuals, quantum, tempest, python-keystoneclient, python-novaclient, python-quantumclient

[Charts: Monthly active authors in OpenStack; Total number of authors of OpenStack; History of issues opened vs closed]

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at April 13, 2012 07:48 PM

SwiftStack Team

SwiftStack is bringing it at the 2012 OpenStack Design Summit and Conference

Where does the time go? It seems like yesterday I was bemoaning the fact that this year's OpenStack Design Summit was months away. Now it's nearly here! Hope you got a ticket, because the Design Summit portion is sold out.

SwiftStack is glad to announce that we will be presenting three sessions at the Summit and Conference this year. I'll summarize them here just to whet your appetite:

7 Steps To Heaven

Joe Arnold
Thursday April 19, 2012 3:00pm - 3:40pm
Breakout B (Bayview Level, Hyatt Regency Hotel)

If you’re reading this, then you’ve probably at least considered using OpenStack Swift at your place of enterprise. But by now you probably know it's not as easy as double-clicking on an icon or visiting an App Store. What does it actually take to make it happen? Our fearless leader Joe Arnold will be taking on the greater challenges surrounding making your Swift dream a reality, including:

  • Convincing management that Swift is the right choice...when it is the right choice
  • How to configure Swift clusters in a way that you won’t regret in the morning
  • How to leverage Swift to build effective applications, making it all worthwhile

Monitoring Swift with StatsD

Darrell Bishop
Tuesday April 17, 2012 3:30pm - 3:55pm
Marina Room

Darrell will be giving a 25-minute presentation on Swift cluster monitoring using StatsD. In it, he will make an argument for deeply integrated monitoring, contrast this with existing approaches, and outline his approach to implementation, which he has submitted to Swift and which is currently under review. This is a must-attend for people interested in running production Swift clusters. Darrell has written a fascinating and comprehensive post on this very blog which surveys the Swift monitoring landscape.

Swift Workshop

Joe Arnold, Darrell Bishop, and Orion Auld
Friday April 20, 2012 1:30pm - 4:00pm
Breakout B (Bayview Level, Hyatt Regency Hotel)

After a brief discussion of Swift’s basic architecture, Joe and Darrell will lead attendees through the steps necessary to install Swift on their laptop computers. They’ll show how to use VirtualBox to run VMs that will act as Swift nodes. Then attendees will learn the ins and outs of Swift setup: Swift daemon configuration, Builder file and Ring creation, and Swift startup.

Once attendees have a cluster up and running, they’ll learn how to use it. Attendees will use curl and the swift command-line tool to authenticate themselves, put data into the cluster, and remove it therefrom. In the process, they will observe core swift operations (like replication) in action.

Finally, once attendees have a working Swift cluster up and running, I will be taking some time to answer the question: what good is it? I will walk users through the steps required to get a simple application up and running that uses Swift as its backbone. We will discuss some of the ways that Swift can form an effective part of any major enterprise, and some of the techniques that can be employed to create useful Swift-based apps.

We'll see you at the conference!

April 13, 2012 01:21 PM

Martin Loschwitz

Presenting OpenStack at the LinuxTag 2012 in Berlin, Germany

I am happy and proud to announce that my proposal for a presentation of OpenStack at the LinuxTag 2012 in Berlin, Germany, was accepted by the organizational committee! I'll have the opportunity to present the OpenStack cloud solution on May 25, 2012 from 16:30 to 17:15 in the room "Europa II". The presentation will be held in German.

read more

by martin at April 13, 2012 12:49 PM

April 12, 2012

OpenStack Blog

OpenStack Foundation Update

We’ve come a long way since announcing in October in Boston that we planned to create an OpenStack Foundation in 2012. We set out to solicit as much feedback as possible (and on the second day, there was a town hall) and to participate in as many open forums as possible (mailing lists, webinars, meetups, and in person meetings).

At the start of the process, Jonathan Bryce and I spent the first couple of months learning as much as we could about successful open source foundations, like the ASF, Eclipse, and the Linux Foundation, reading foundation meeting minutes into the wee hours of the morning. We also talked to a lot of lawyers to get advice on legal structures, and received feedback from many of the folks in the community. We entered 2012 with a heck of a lot more knowledge and a good sense of what proposals to put forward (borrowing heavily from these amazing trailblazers before us!) that would best fit the “OpenStack Way.”

Since then we’ve had active mailing list discussions, held several webinars and meetups, and published a ton of stuff on the wiki, culminating in a Mission, Structure, and Funding Model that stay true to our values as a community, including:

  • An open development process that is driven by technical meritocracy
  • Making significant investments in community building and driving awareness and adoption
  • Encouraging the development of a healthy and profitable ecosystem of companies powered by OpenStack

Last month, as that framework started coming into focus, we published a “Framework Acknowledgement Letter” and asked companies to sign it if they agreed with the approach and were intent on joining as Gold or Platinum members once the foundation is formed. Today we are very excited to announce that nineteen companies have signed the letter:

  • Platinum: AT&T, Canonical, HP, IBM, Nebula, Rackspace, Red Hat, SUSE
  • Gold: Cisco, ClearPath, Cloudscaling, Dell, DreamHost, ITRI, Mirantis, Morphlabs, NetApp, Piston Cloud Computing, Yahoo!

What’s Next?
We are now forming a Drafting Committee to take the framework and turn it into legal documents, with the help of legal resources from the above companies. The Drafting Committee process and timeline are outlined on the wiki, and the committee will be publishing drafts for community review & input, with a goal of getting to a final draft in Q3. The committee will not be making decisions in a vacuum; it will be putting the framework into long-form legalese for all of us to review and comment on.

I don’t think it’s a stretch to suggest that cloud computing will one day power our global economy, and that means there is a lot at stake. Seeing the caliber of companies putting serious resources into making OpenStack successful, who all believe deeply in the open development model, I am more optimistic than ever it will be an open future, powered by OpenStack.

Here are a few related posts about today’s announcement from participating companies

In closing, I’ll leave you with the mission we are excited to pursue when the foundation is formed later this year:

The Foundation Mission: The OpenStack Foundation is an independent body providing shared resources to help achieve the OpenStack Mission by Protecting, Empowering, and Promoting OpenStack software and the community around it, including users, developers and the entire ecosystem.

Mark Collier

@sparkycollier

by Mark Collier at April 12, 2012 01:59 PM

Thierry Carrez

OpenStack Folsom Design Summit

In a few days the OpenStack developer community will gather in the heart of San Francisco for three days of brainstorming and discussions around the next release cycle of OpenStack projects, code-named “Folsom”.

The Design Summit is a key moment for our open innovation community. This is not a conference with speakers. This is not where a closed developer group announces to the public the changes they intend to push to their private “open source” project. We design, discuss and make decisions at the summit as a community. It’s quite uncommon, and that’s what makes us different.

Our (elected) PTLs have final say in case of unsolvable conflicts, but generally consensus is reached in those face-to-face meetings much more easily than on mailing-lists. That’s why this is a critical moment, and we need to make the best use of this short time together. Connect with other people interested in solving the same issues, avoid duplication of work, and collaborate with developers from all those different companies on making OpenStack awesome.

We have great brainstorming topics for those three days. Most tracks already have a tentative schedule posted at http://folsomdesignsummit2012.sched.org/, although it’s still subject to scheduling changes. If you have a new idea for a session, it’s too late to get in the official tracks, but we provide an Unconference room for talks that could not fit in the tracks, last-minute ideas and continuation of discussions. And since we like to talk about random stuff that matters to us, we will also have 5-min lightning talks every day after lunch.

Session leads should take the time to view Jim Plamondon’s training video: it’s a great introduction on how to make the most of a session you lead. I hope to meet all of you in person next week!

 


by Thierry Carrez at April 12, 2012 10:12 AM

April 11, 2012

SwiftStack Team

Monitoring Swift With StatsD

A Swift cluster is a complicated beast---a collection of many daemons across many nodes, all working together. With so many "moving parts" it's important to be able to tell what's going on inside the cluster. Tracking server-level metrics like CPU utilization, load, memory consumption, disk usage and utilization, etc. is necessary, but not sufficient. We need to know what the different daemons are doing on each server. What's the volume of object replication on node8? How long is it taking? Are there errors? If so, when did they happen?

In such a complex ecosystem, it's no surprise that there are multiple approaches to getting the answers to these kinds of questions. Let's examine some of the existing approaches to Swift monitoring and then discuss what we do here at SwiftStack.

Swift Recon

The Swift Recon middleware can provide general machine stats (load average, socket stats, /proc/meminfo contents, etc.) as well as Swift-specific metrics:

  • The MD5 sum of each ring file.
  • The most recent object replication time.
  • Count of each type of quarantined file: account, container, or object.
  • Count of "async_pendings" (deferred container updates) on disk.

Swift Recon is middleware installed in the object server's pipeline and takes one required option: a local cache directory. Tracking of async_pendings requires an additional cron job per object server. Data is then accessed by sending HTTP requests to the object server directly, or by using the swift-recon command-line tool.

There are some good Swift cluster stats in there, but the general server metrics overlap with existing server monitoring systems and to get the Swift-specific metrics into a monitoring system, they must be polled. Swift Recon is essentially acting as a middle-man metrics collector. The process actually feeding metrics to your stats system, like collectd, gmond, etc., is probably already running on the storage node. So it could either talk to Swift Recon or just collect the metrics itself.

There's an upcoming update to Swift Recon which broadens support to the account and container servers. The auditors, replicators, and updaters can also report statistics, but only for the most recent run.

At SwiftStack, we need to track many more aspects of the cluster's operation beyond what Swift Recon covers, which brings us to the next tool.

Swift-Informant

Florian Hines developed the Swift-Informant middleware to get real-time visibility into Swift client requests. It sits in the proxy server's pipeline and after each request to the proxy server, sends three metrics to a StatsD server:

  • A counter increment for a metric like obj.GET.200 or cont.PUT.404.
  • Timing data for a metric like acct.GET.200 or obj.GET.200. [The README says the metrics will look like duration.acct.GET.200, but I don't see the "duration" in the code. I'm not sure what Etsy's server does, but our StatsD server turns timing metrics into 5 derivative metrics with new segments appended, so it probably works as coded. The first metric above would turn into acct.GET.200.lower, acct.GET.200.upper, acct.GET.200.mean, acct.GET.200.upper_90, and acct.GET.200.count]
  • A counter increase by the bytes transferred for a metric like tfer.obj.PUT.201.
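The three per-request metrics above map directly onto StatsD's plain-text wire format (`name:value|type`). What follows is a minimal sketch under stated assumptions, not Swift-Informant's actual code: the helper name `informant_metrics` and its arguments are hypothetical, and the metric names simply follow the examples in the list.

```python
def informant_metrics(server_type, method, status, duration_ms, bytes_transferred):
    """Build the three StatsD payloads described above for one request.

    Hypothetical helper for illustration; not taken from Swift-Informant.
    """
    name = "%s.%s.%d" % (server_type, method, status)
    return [
        # Counter: one more request of this server type/method/status.
        "%s:1|c" % name,
        # Timer: how long the request took, in milliseconds.
        "%s:%d|ms" % (name, duration_ms),
        # Counter: bytes transferred, under a "tfer." prefix.
        "tfer.%s:%d|c" % (name, bytes_transferred),
    ]


payloads = informant_metrics("obj", "PUT", 201, 42, 4096)
```

Each string would travel as its own UDP datagram to the StatsD server, which is why the per-request cost stays small.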

This is good for getting a feel for the quality of service clients are experiencing with the timing metrics, as well as getting a feel for the volume of the various permutations of request server type, command, and response code. Swift-Informant also requires no change to core Swift code since it is implemented as middleware. However, because of this, it gives you no insight into the workings of the cluster past the proxy server. If one storage node's responsiveness degrades for some reason, you'll only see that some of your requests are bad---either as high latency or error status codes. You won't know exactly why or where that request tried to go. Maybe the container server in question was on a good node, but the object server was on a different, poorly-performing node.

So we need deeper visibility into the cluster's operation, behind the proxy servers.

Statsdlog

Florian's statsdlog project increments StatsD counters based on logged events. Like Swift-Informant, it is also non-intrusive, but statsdlog can track events from all Swift daemons, not just proxy-server. The daemon listens to a UDP stream of syslog messages and StatsD counters are incremented when a log line matches a regular expression. Metric names are mapped to regex match patterns in a JSON file, allowing flexible configuration of what metrics are extracted from the log stream.
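The log-driven approach can be sketched in a few lines. This is an illustration only, not statsdlog's code: the pattern table, metric names, and function names below are hypothetical, and the sketch counts matches in memory instead of sending them to StatsD. Each log line increments at most one counter, by 1, for the first pattern that matches.

```python
import re

# Hypothetical metric-name -> regex table; statsdlog loads its own
# patterns from a JSON file, which is not reproduced here.
PATTERNS = [
    ("errors.timeout", re.compile(r"ConnectionTimeout")),
    ("errors.lock", re.compile(r"LockTimeout")),
    ("errors.any", re.compile(r"ERROR")),
]


def match_metric(log_line, patterns=PATTERNS):
    """Return the metric for the first regex that matches, or None."""
    for metric, regex in patterns:
        if regex.search(log_line):
            return metric
    return None


def count_events(log_lines, patterns=PATTERNS):
    """Tally matches per metric; each matching line adds exactly 1."""
    counts = {}
    for line in log_lines:
        metric = match_metric(line, patterns)
        if metric is not None:
            counts[metric] = counts.get(metric, 0) + 1
    return counts


sample = [
    "object-server ERROR ConnectionTimeout talking to 10.0.0.8",
    "container-server ERROR LockTimeout on db",
    "proxy-server GET /v1/AUTH_test 200",
]
counts = count_events(sample)
```

Note that `errors.any` never fires here even though two lines contain `ERROR`: an earlier, more specific pattern wins, so the order of the table matters.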

Currently, only the first matching regex triggers a StatsD counter increment, and the counter is always incremented by 1. There's no way to increment a counter by more than one or send timing data to StatsD based on the log line content. The tool could be extended to handle more metrics per line and data extraction, including timing data. But even then, there would still be a coupling between the log textual format and the log parsing regexes, which would themselves be more complex in order to support multiple matches per line and data extraction. Also, log processing introduces a delay between the triggering event and sending the data to StatsD. We would prefer to increment error counters where they occur, send timing data as soon as it is known, avoid coupling between a log string and a parsing regex, and not introduce a time delay between events and sending data to StatsD. And that brings us to the next method of gathering Swift operational metrics.

Swift StatsD Logging

StatsD was designed for application code to be deeply instrumented; metrics are sent in real-time by the code which just noticed something or did something. The overhead of sending a metric is extremely low: a sendto of one UDP packet. If that overhead is still too high, the StatsD client library can send only a random portion of samples and StatsD will approximate the actual number when flushing metrics upstream.
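Sampling can be sketched as follows. This is a simplified illustration, not the Swift or pystatsd implementation: the function names are hypothetical, and the injectable `rng` argument exists only to make the behavior easy to check. The client skips a random share of events and tags the rest with `|@rate`, and the server scales each received sample by `1/rate`.

```python
import random


def sampled_payload(metric, sample_rate, rng=random.random):
    """Return a counter payload, or None if this event is skipped.

    The "|@rate" suffix tells the StatsD server how to scale the count.
    """
    if sample_rate < 1:
        if rng() >= sample_rate:
            return None  # dropped: no packet is sent for this event
        return "%s:1|c|@%s" % (metric, sample_rate)
    return "%s:1|c" % metric


def estimated_total(received, sample_rate):
    """Server-side estimate: each received sample stands for 1/rate events."""
    return received / sample_rate
```

With a sample rate of 0.1, roughly one event in ten produces a packet, and ten received packets are reported upstream as about a hundred events.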

To avoid the problems inherent with middleware-based monitoring and after-the-fact log processing, we integrated the sending of StatsD metrics into Swift itself. Our submitted change set currently reports 124 metrics across 15 swift daemons and the tempauth middleware. Details of the metrics tracked are in the Admin Guide.

The sending of metrics is integrated with the logging framework. To enable, configure log_statsd_host in the relevant config file. You can also specify the port and a default sample rate. The specified default sample rate is used unless a specific call to a statsd logging method (see the list below) overrides it. Currently, no logging calls override the sample rate, but it's conceivable that some metrics may require accuracy (sample_rate == 1) while others may not.

[DEFAULT]
...
log_statsd_host = 127.0.0.1
log_statsd_port = 8125
log_statsd_default_sample_rate = 1

Then the LogAdapter object returned by get_logger(), usually stored in self.logger, has the following new methods:

  • set_statsd_prefix(self, prefix) Sets the client library's stat prefix value which gets prepended to every metric. The default prefix is the "name" of the logger (eg. "object-server", "container-auditor", etc.). This is currently used to turn "proxy-server" into one of "proxy-server.Account", "proxy-server.Container", or "proxy-server.Object" as soon as the Controller object is determined and instantiated for the request.
  • update_stats(self, metric, amount, sample_rate=1) Increments the supplied metric by the given amount. This is used when you need to add or subtract more than one from a counter, like incrementing "suffix.hashes" by the number of computed hashes in the object replicator.
  • increment(self, metric, sample_rate=1) Increments the given counter metric by one.
  • decrement(self, metric, sample_rate=1) Lowers the given counter metric by one.
  • timing(self, metric, timing_ms, sample_rate=1) Record that the given metric took the supplied number of milliseconds.
  • timing_since(self, metric, orig_time, sample_rate=1) Convenience method to record a timing metric whose value is "now" minus an existing timestamp.
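To make the interface above concrete, here is a small standalone sketch of a client with the same method names. It is not Swift's LogAdapter: the class name `StatsdSketch`, its constructor, the `sent` list, and the `in_flight` metric are hypothetical, and `sample_rate` is accepted but ignored to keep the example short. When no host is configured, every method silently does nothing.

```python
import socket
import time


class StatsdSketch(object):
    """Illustrative stand-in for the methods listed above (hypothetical)."""

    def __init__(self, host=None, port=8125, prefix=""):
        self.target = (host, port) if host else None
        self.prefix = prefix
        self.sent = []  # payloads kept only so the sketch can be inspected
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def set_statsd_prefix(self, prefix):
        self.prefix = prefix

    def _send(self, metric, value, kind):
        if self.target is None:
            return  # StatsD not configured: behave as a no-op
        name = "%s.%s" % (self.prefix, metric) if self.prefix else metric
        payload = "%s:%s|%s" % (name, value, kind)
        self.sent.append(payload)
        try:
            self.sock.sendto(payload.encode("utf-8"), self.target)
        except OSError:
            pass  # metrics are best-effort; never break the caller

    def update_stats(self, metric, amount, sample_rate=1):
        self._send(metric, amount, "c")

    def increment(self, metric, sample_rate=1):
        self.update_stats(metric, 1, sample_rate)

    def decrement(self, metric, sample_rate=1):
        self.update_stats(metric, -1, sample_rate)

    def timing(self, metric, timing_ms, sample_rate=1):
        self._send(metric, timing_ms, "ms")

    def timing_since(self, metric, orig_time, sample_rate=1):
        elapsed_ms = (time.time() - orig_time) * 1000
        self.timing(metric, elapsed_ms, sample_rate)


quiet = StatsdSketch()        # no host configured
quiet.increment("failures")   # does nothing

stats = StatsdSketch(host="127.0.0.1", prefix="object-replicator")
stats.update_stats("suffix.hashes", 5)
stats.decrement("in_flight")
```

Because the unconfigured case is handled inside `_send`, callers never need an `if statsd_enabled:` guard around a metric call.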

Note that these logging methods may safely be called anywhere you have a logger object. If StatsD logging has not been configured, the methods are no-ops. This avoids messy conditional logic in each place a metric is recorded. Here are two example usages of the new logging methods:

``` python

# swift/obj/replicator.py (excerpt)

def update(self, job):
    # ...
    begin = time.time()
    try:
        hashed, local_hash = tpool.execute(
            tpooled_get_hashes, job['path'],
            do_listdir=(self.replication_count % 10) == 0,
            reclaim_age=self.reclaim_age)
        # See tpooled_get_hashes "Hack".
        if isinstance(hashed, BaseException):
            raise hashed
        self.suffix_hash += hashed
        # Add the number of hashes computed on this pass to the counter.
        self.logger.update_stats('suffix.hashes', hashed)
        # ...
    finally:
        self.partition_times.append(time.time() - begin)
        # Record the time elapsed since `begin` as a timing metric.
        self.logger.timing_since('partition.update.timing', begin)

```

``` python

# swift/container/updater.py (excerpt)

def process_container(self, dbfile):
    # ...
    start_time = time.time()
    # ...
    if ...:  # the condition is elided in this excerpt
        # ...
        for event in events:
            # Any 2xx response counts as a successful update.
            if 200 <= event.wait() < 300:
                successes += 1
            else:
                failures += 1
        if successes > failures:
            self.logger.increment('successes')
            # ...
        else:
            self.logger.increment('failures')
            # ...
        # Only track timing data for attempted updates:
        self.logger.timing_since('timing', start_time)
    else:
        self.logger.increment('no_changes')
        self.no_changes += 1

```

We wanted to use the pystatsd client library (not to be confused with a similar-looking project also hosted on GitHub), but the released version on PyPI was missing two desired features that the latest version on GitHub had: the ability to configure a metrics prefix in the client object and a convenience method for sending timing data between "now" and a "start" timestamp you already have. So we just implemented a simple StatsD client library from scratch with the same interface. This has the nice fringe benefit of not introducing another external library dependency into Swift.

Final Thoughts

In this post we looked at a few ways to gather metrics for a Swift cluster. None of them are complete. Swift Recon doesn't look at enough and overlaps with general-purpose server monitoring packages. Swift-Informant only sees the world from a client's perspective. Statsdlog only handles counter incrementing and parses log messages after-the-fact versus incrementing counters directly when they occur. Our integrated StatsD logging doesn't gather "gauge" metrics like "how many async_pendings are there right now." Instead, we only increment a counter when something is quarantined or an async_pending is created. Those metrics naturally map to a gauge.

We feel a Swift cluster's operation is best tracked with the combination of a general-purpose server monitoring system, a mechanism for polling Swift-specific gauge metrics, and deep StatsD logging instrumentation in Swift for counter and timing metrics. For polling Swift-specific gauge metrics, we're probably best off using a plugin for a general-purpose collection system. That plugin could read data from swift-recon, or gather the Swift-specific metrics directly.
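A gauge such as "how many async_pendings are there right now" has to be measured by polling. The sketch below shows the kind of measurement a collection plugin might take; it is an illustration under stated assumptions, not SwiftStack's plugin. The function names are hypothetical, and the `<devices_root>/<device>/async_pending/` layout is assumed for the example.

```python
import os
import tempfile


def count_async_pendings(devices_root):
    """Count files under <devices_root>/<device>/async_pending/.

    The directory layout is an assumption made for this sketch.
    """
    total = 0
    if not os.path.isdir(devices_root):
        return total
    for device in os.listdir(devices_root):
        pending_dir = os.path.join(devices_root, device, "async_pending")
        # os.walk yields nothing if pending_dir does not exist.
        for _dirpath, _dirnames, filenames in os.walk(pending_dir):
            total += len(filenames)
    return total


def gauge_payload(metric, value):
    """Format a StatsD gauge ("|g"): an absolute reading, not a delta."""
    return "%s:%d|g" % (metric, value)


# Demo against a throwaway directory tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sdb1", "async_pending", "abc"))
for name in ("update-1", "update-2"):
    open(os.path.join(root, "sdb1", "async_pending", "abc", name), "w").close()
os.makedirs(os.path.join(root, "sdb2"))
reading = gauge_payload("async_pendings", count_async_pendings(root))
```

Unlike a counter, each reading replaces the previous one, so a poller that misses an interval loses resolution but does not skew the value.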

At SwiftStack, we use collectd plus some Python plugin code for server monitoring. We also embed a StatsD server in collectd so there's one process per node funneling stats back to a Graphite cluster. With this setup, we have all the aforementioned bases covered: general purpose monitoring, Swift-specific gauge monitoring, and real-time counter and timing data directly from Swift.

Once you have all this great data, what do you do with it? Well, that's going to require its own post. But in addition to the obvious, like graphing it, you can perform anomaly detection, trigger alerts, maintain a real-time view of entity health, avoid surprises with capacity forecasting, and more!

Creative Commons License
"Monitoring Swift With StatsD" by SwiftStack, Inc. is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

April 11, 2012 11:38 AM

Martin Loschwitz

Presenting OpenStack at the LinuxWochen 2012 in Vienna, Austria

I've got good news for all those that can't make it to my OpenStack presentation in Graz: As I was just told, my presentation for the LinuxWochen 2012 in Vienna was accepted. That means that I will be presenting the OpenStack cloud stack on May 3, 2012, in Austria's beautiful capital, Vienna. 

read more

by martin at April 11, 2012 07:24 AM

Zmanda

MySQL Backup Updated

As MySQL continues to grow (as a technology and as an ecosystem) the need and importance of creating and deploying robust MySQL backup solutions grows as well. In many circles Zmanda is known as “The MySQL Backup Company”. While we provide backup of a wide variety of environments, we gladly take the label of backing up the most popular open source database in the world, especially as we kick off our presence at the 2012 MySQL Conference.

Here are some of the updates to our MySQL backup technologies that we are announcing at the conference:

Announcing Zmanda Recovery Manager 3.4

We have updated the popular Zmanda Recovery Manager (ZRM) for MySQL product for scalability. Our customers continue to deploy ZRM to back up ever larger MySQL environments. Some of the scalability features include: better support for hundreds of backup sets within one ZRM installation, support for more aggressive backup schedules, better support for site-wide templates, and deeper integration with NetApp’s snapshot mechanisms. We have also added support for the latest versions of XtraBackup and MySQL Enterprise Backup, as well as experimental support for backing up Drizzle (via XtraBackup). If you are deploying Drizzle in your environment, we are looking for beta customers.

Many of our customers store their MySQL databases on NetApp storage. ZRM can be used in conjunction with NetApp Snapshot and SnapVault products to create database-consistent backups without moving the data out of NetApp storage. ZRM creates snapshots of MySQL database volumes, which it can then move to another NetApp storage system using NetApp SnapVault. SnapVault moves the data efficiently between various NetApp filers. This provides customers a way to protect the backups without impacting their corporate LAN. ZRM uses SnapRestore functionality to quickly restore the databases in case of a failure.

Announcing MySQL Backup Agent for Symantec NetBackup

If you have Symantec NetBackup deployed in your environment, and you would like to consolidate your MySQL backups under the umbrella of a NetBackup-based backup infrastructure, you now have a well-integrated solution. We have released the MySQL Backup Agent, which is deeply integrated with Symantec NetBackup. This agent allows you to perform live backups of your MySQL databases directly from your MySQL servers to your NetBackup server.

NetBackup MySQL Agent


Backup of your MySQL databases to the Cloud

Public or Private Cloud Storage is a great choice as an offsite store for backup archives. You can also use compute clouds as an inexpensive DR site for your MySQL databases. For MySQL databases running on Windows, our Zmanda Cloud Backup product provides a very easy and inexpensive way to back up to Amazon S3 or Google Cloud Storage.

If you have MySQL databases running on Linux or heterogeneous environments, you have two choices for backing up to the cloud. The first is to use our Amanda Enterprise product with the Amazon S3 or Google Cloud Storage option to move backup images created by ZRM to the cloud. The second is to use the recently released Amazon Storage Gateway in conjunction with ZRM.

ZRM Backing Up To AWS Gateway Storage

We have published an integration report (available on Zmanda Network under the MySQL Backup section - free registration required) to show how you can deploy AWS Gateway to asynchronously upload backup files created by ZRM to Amazon S3.

As you can see, we have been busy updating our MySQL backup solutions. All of the above improvements and feature additions are based on feedback provided by MySQL DBAs. If you are visiting the MySQL user conference this week, please do visit us at our booth - we would love to understand and discuss your MySQL backup challenges.

by ck at April 11, 2012 07:14 AM

Michael Still

Folsom Dev Summit sessions

I thought I should write up the dev summit sessions I am hosting now that the program is starting to look solid. This is mostly for my own benefit, so I have a solid understanding of where to start these sessions off. Both are short brainstorm sessions, so I am not intending to produce slide decks or anything like that. I just want to make sure there is something to kick discussion off.

Image caching, where to from here (nova hypervisors)

As of Essex, libvirt has an image cache to speed up the startup of new instances. This cache stores images direct from glance, as well as resized images. There is a periodic task which cleans up images in the cache which are no longer needed. The periodic task can also optionally detect images which have become corrupted on disk.

So first off, do we want to implement this for other hypervisors as well? As mentioned in a recent blog post I'd like to see the image cache manager become common code and have all the hypervisors deal with this in exactly the same manner -- that makes it easier to document, and means that on-call operations people don't need to determine what hypervisor a compute node is running before starting to debug. However, that requires the other hypervisor implementations to change how they stage images for instance startup, and I think it bears further discussion.

Additionally, the blueprint (https://blueprints.launchpad.net/nova/+spec/nova-image-cache-management) proposed that popular / strategic images could be pre-cached on compute nodes. Is this something we still want to do? What factors do we want to use for the reference implementation? I have a few ideas here that are listed in the blueprint, but most of them require talking to glance to implement. There is some hesitance in adding glance calls to a periodic task, because in a keystone'd implementation that would require an admin token in the nova configuration file. Is there a better way to do this, or is it ok to rely on glance in a periodic task?

Ops pain points (nova other)

Apart from my own ideas (better instance logging for example), I'm very interested in hearing from other people about what we can do to make nova easier for ops people to run. This is especially true for relatively easy to implement things we can get done in Folsom. This blueprint for deployer friendly configuration files is a good example of changes which don't look too hard to implement, but that would make the world a better place for opsen. There are many other examples of blueprints in this space, including:



What else can we be doing to make life better for opsen? I'm especially interested in getting people who actually run openstack in the wild into the room to tell us what is painful for them at the moment.

Tags for this post: openstack canonical folsom image_cache_management sre

April 11, 2012 01:25 AM

April 10, 2012

Stefano Maffulli

You really want APIs designed by a community

I keep reading very strange things about OpenStack and its support for Amazon APIs. I believe it was generally acknowledged that APIs designed by one company were bad and that an open API is really what you wanted. Today, in “Citrix kicks down door, breaks up OpenStack cloud party”, Matt Asay first quotes a superficial comment from Sam Johnston comparing OpenStack’s soon-to-be-obsoleted governance model to the Eclipse Foundation (suggestion: compare the draft documents for the OpenStack Foundation to the Eclipse Foundation governance model).

Then Matt jumps on the favorite topic of the cloud pundits: the API! It seems that Matt is cheering for one API, Amazon’s, as if the history of Microsoft in the 90s didn’t teach us anything:

One aspect of [Rackspace's] agenda is a shift away from full API compatibility with Amazon’s API, which is one of CloudStack’s major selling points, and one of the big reasons it is striking off on its own. Rackspace could easily have followed this furrow, first plowed by Eucalyptus and VMops/Cloud.com/Citrix, but doing so would effectively cede the API battle to its bitter enemy, Amazon.

The first assumption is just wrong: Rackspace’s strategy is not the same thing as OpenStack’s strategy. Besides, Rackspace has two lines of business on OpenStack and those may not be 100% aligned.

Then, OpenStack supports Amazon’s API quite well, thank you. Such support is there to help companies move away from Amazon. I’ll have you read Jim Plamondon’s comment about how Amazon should not be allowed to follow Microsoft’s strategy (and Jim knows it very well).

OpenStack and the Free Software/Open Source movement are in the rare position of being able to shape the future of computing, with an API that is designed by a very large set of companies, with lots of money too. This is not a small set of hackers trying to change the world: it’s BIG. OpenStack has the chance to set a real open standard for cloud computing while still allowing a compatibility layer as a migration path for all developers who are currently stuck on the proprietary API designed by Amazon behind closed doors. Companies like HP, Dell, Rackspace, Canonical and so many others are putting together a standard API, a truly open standard that I expect can also be easily ratified by a standards body, if/when needed.

I wonder why people claiming to be open source supporters cheer for the (quasi)monopolist and try at every occasion to shut down the effort of such a large community to provide the world with an open alternative. What am I missing?


by Stef at April 10, 2012 07:08 PM

Martin Loschwitz

Speaking at the Grazer LinuxTage 2012

Heads up to those living in Austria: I'll be presenting the OpenStack cloud platform at this year's edition of the Grazer LinuxTage, which will be held on April 28 at the FH Joanneum in the City of Graz, Austria. More detailed information on my presentation can be found here:

http://glt12-programm.linuxtage.at/events/92.de.html

by martin at April 10, 2012 11:18 AM

April 08, 2012

Joe Heck

6 months later… Essex

Easter Sunday and I’ve some time to sit back, relax, and spend the weekend doing a little cleaning and catching up. I’ve got to admit to some embarrassment that the last post here was the wrap-up of the OpenStack Diablo design summit.

To say it’s been a busy six months since the last post is really quite an understatement. What we released in the past month, and the work that went in to getting us there, is nothing short of phenomenal.

The OpenStack Essex release is out, and I’m still sort of catching my breath from that. Just six weeks prior to the release I was elected as the PTL for Keystone. I had no idea how much additional work it was to wrangle the bugs, approvals, and the other sundry details that go into getting the release out the door. I rather wish there was a “new PTL user’s manual” for the process. It was a lot of learning, most of it last-minute and ad-hoc, around our project management, release process and the associated mechanics. I spent a lot of time trying to figure out how to wrangle Launchpad into some semblance of useful tracking, etc. (To be fair, it was mostly a matter of learning what Launchpad could and couldn’t do, and figuring out the conventions that were already in place.)

The past month was terrifically busy with getting OpenStack deployed and tested – running in small environments (devstack), and large. I’ve been working from the Ubuntu distributions myself, and the whole gang that has been packaging OpenStack (Chuck, Adam, Kiall, and others – inside Canonical and out) with the downstream distributions has done a tremendous job wrapping together the changes and making it into a deployable release from packages.

A much better release:

Where I was, er, more than disappointed with the Diablo release, I’m significantly happier with the Essex release. Keystone (the project for which I am now the PTL) didn’t advance this release cycle so much as retrench and prepare for advancement this next release cycle. Horizon made huge leaps and bounds, which I think is fantastic as the face of OpenStack to many users. Nova and Swift advanced, and the integration work and definition going into making Quantum (and Melange) a reality has been terrific.

I think if I were to pull out a star of the show for the Essex release, then I’ve really got to point to the combined work of the crew formulating and keeping DevStack solid, and the CI team that integrated it into our development and review process so that we had a minimum of guaranteed interoperability. Where I was screaming into the wind during the last milestone of Diablo about continued breakages between the components, the integration of devstack has demanded that changes be rolled in with thought to an overall use case and interop. With 200+ developers all kicking this ball down the field, the guaranteed CI has been the piece that has done us the most good.

A little about Keystone:

Keystone, as we retrenched and simplified the codebase, also made a significant advance in integration testing. We replaced the entire codebase based on a series of integration tests that verified and guaranteed API compatibility with the Diablo and trunk releases as well as the client while we made that switch. The underlying code base is now significantly simplified, and the internal architecture of that service is now wrapped around some core “internal service” concepts to allow us to have drivers that cleanly back-end into external systems to support identity and authorization.

I’ve received a number of questions about “why the API still sucks” six months after Diablo. There’s still no obvious means for a user to “log out” (invalidate their own token), or change their own password (assuming the back-end were to support it). In switching out the underlying code base and architecture we needed to keep the API stable so that we could make sure we had the internals correct. With the internals switched out, it is now time to revisit the API and take the lessons we’ve learned from 6+ months of using the v2.0 API and improve upon it.

Looking towards the summit:

While I’m much, much happier with the Essex release, there are still plenty of places for improvement – many that I remember from the Diablo summit, and some new things that I’d love to see some focus on for the upcoming six months.

While Horizon has given us a dramatically improved user experience for OpenStack, there’s much more that we could do there – both with a web based UI, and with our command line interfaces to OpenStack. One thing that could use some explicit attention is the proliferation of “clients” needed to interact with OpenStack. As projects are splintering off Nova into their own domains, we have a number of new command line clients (nova, keystone, glance, quantum) – and they don’t all act the same. There is some great work on driving them to consistency, but I wonder if we shouldn’t bag all the individual clients and roll them together into one “OpenStack” client that is consistent in how it handles command line options, what the “commands” look like, and generally how you interact with them.

I hate to admit this, but even as the PTL of Keystone, I ran into a brick wall of old docs referring to using the command “nova-manage project list” to see the projects, when I knew darn well and good they were all in the keystone system, and to see a list of them I should use the command “keystone tenant-list”. Not so hot, huh? Quantum and the nova-network components are going to really come into their own in this release, so we’ll have more shattering of the user experience from a deployer’s point of view unless we do something to be specific about tying it together.

There are more changes to be made in Keystone that I’m paying a lot of attention to – some of which are honestly going to require buy-in from the entire community to make happen. Where we place the boundaries of role based access, and how we deal with trust and information sharing about identity between the projects is probably going to need some changes that will ripple all the way down into the API’s of nova, swift, glance, quantum, melange, etc. Getting the brainstorming around those concepts and desired features is what I primarily want to accomplish at the Design Summit, with follow on in the milestone timeframe thereafter to bring the new features into existence and integrate them across OpenStack. My ideal goal is to get all the heavy features implemented early in the Folsom release cycle so that they’re solid and nailed down for the later milestones, and we can tweak and fix what may be broken long before any release points.

About that CloudStack thing:

Seems like in the last week, everybody (or at least the pundits) got their knickers in a twist about Citrix finally being more open about their dual-interest in CloudStack and OpenStack. Regardless of Gartner pundit sensationalistic babbling, the release of CloudStack as an Apache licensed project is nothing but goodness for the whole community. The Apache license makes me feel more comfortable with reviewing their code and looking at how they attacked the same issues and problems OpenStack has, and I expect they’ll look closely at how OpenStack has done the same. I applaud any company trying to build an open-source community, and I’m looking forward to seeing what Citrix does to actually do that community building. If it gains ground and folks contribute to CloudStack, we’re all better off – it’s more ideas hitting the street and becoming reality.

Finally, for anyone that didn’t think that the user-facing EC2 API was the de facto API, wake the F* up. It is reasonable, solid, and nobody is going to be turning it off any time soon. It also doesn’t allow/enable some things that a lot of cloud administrators would like to do, so it shouldn’t surprise anyone that CloudStack has their “api” and OpenStack has their “api”. The APIs are another place where we can learn from each other, as we add some value beyond what the bookseller down the street might want to impose and expose.

If you want a little pundit action from this source, watch the API’s of both CloudStack and OpenStack (not just the EC2 compatibility layer) over the next six months. I think you’ll be seeing some very interesting steps forward.

by Joe at April 08, 2012 07:37 PM

April 06, 2012

OpenStack Blog

Community Weekly Review (Mar 30 – Apr 6)

OpenStack Community Newsletter – April 6, 2012

HIGHLIGHTS

EVENTS

REPORTS FROM PAST EVENTS

OTHER NEWS

COMMUNITY STATISTICS

To celebrate the Essex release, we are publishing the statistics assembled by Mark McLoughlin using gitdm, covering the top 20 contributors across Nova, Glance, Swift, Keystone, Horizon and Quantum (even though Quantum will officially be part of OpenStack in Folsom, not Essex):

Processed 3481 csets from 217 developers
100 employers found
A total of 421695 lines added, 256904 removed (delta 164791)
Developers with the most changesets
termie                     238 (6.8%)
Gabriel Hurley             207 (5.9%)
Brian Waldon               195 (5.6%)
Johannes Erdfelt           146 (4.2%)
Vishvananda Ishaya         116 (3.3%)
Dolph Mathews               98 (2.8%)
Dan Prince                  84 (2.4%)
Ziad Sawalha                80 (2.3%)
Jason Kölker               77 (2.2%)
Mark McLoughlin             73 (2.1%)
Jake Dahn                   73 (2.1%)
Rick Harris                 71 (2.0%)
Alex Meade                  70 (2.0%)
Trey Morris                 62 (1.8%)
Joe Heck                    58 (1.7%)
Chris Behrens               52 (1.5%)
Russell Bryant              50 (1.4%)
Eoghan Glynn                50 (1.4%)
Joe Gordon                  47 (1.4%)
Jesse Andrews               46 (1.3%)
Covers 54.380925% of changesets
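
As a quick sanity check, the summary figures above can be recomputed from the table itself. The short Python sketch below is not part of gitdm; it only re-adds the numbers quoted in this newsletter.

```python
# Changeset counts for the top 20 developers, copied from the table above.
top20 = [238, 207, 195, 146, 116, 98, 84, 80, 77, 73,
         73, 71, 70, 62, 58, 52, 50, 50, 47, 46]
total_changesets = 3481

# Share of all changesets written by the top 20 developers.
coverage = 100.0 * sum(top20) / total_changesets
# Net growth of the code base: lines added minus lines removed.
delta = 421695 - 256904

print("top-20 changesets: %d" % sum(top20))   # 1893
print("coverage: %.2f%%" % coverage)          # 54.38%
print("line delta: %d" % delta)               # 164791
```

The recomputed coverage and delta agree with the gitdm summary lines.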

w00t! Congrats and thanks to those for all their hard work on Essex!

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at April 06, 2012 06:51 PM

Jay Pipes

Ushering in the OpenStack Essex Release

As some of you may have noticed, the OpenStack community published its latest six-month release, codenamed Essex, this week[1]. As shown in the release notes, there’s a massive amount of change that comes in this release.

Some of that change is quite visible. For example, the dashboard project, code-named Horizon, was entirely overhauled and became a core OpenStack project in the Essex release cycle. The new Horizon is pretty stunning[2], if I may say so myself. Other visible awesomeness comes from Anne Gentle and the dozens of contributors who worked on the new API documentation site. It’s an excellent, and well-needed, resource for the community of developers who want to build applications on OpenStack clouds.

Other innovations weren’t so visible, but were just as impactful. The Swift development team added the ability for objects to expire, the ability to post objects via HTML forms with the “tempurl” functionality, and integration with the authentication mechanism in the OpenStack Identity API 2.0.

Under Vishvananda Ishaya’s continued leadership, contributors to the OpenStack Compute project, code-named Nova, focused on a number of things in the past six months. Notably, on the feature front, floating IP pools and role-based access control were added. A variety of internal subsystems were dramatically refactored, including de-coupling the network service from Nova itself (something critical to scaling the network service with the Quantum and Melange incubated projects) as well as separating the volume endpoint into its own service. In addition, the remote procedure call subsystem was streamlined (again) and the way API extensions are handled in source code was cleaned up substantially. On the performance front, there were numerous bug fixes, but one that stands out is the overhaul of the metadata service that Vish completed. This one patch dramatically improves performance of the metadata service used by things like cloud-init when setting up newly launched instances. You can see the entire list of 53 blueprints implemented and 765 bugs fixed in Nova in the Essex release here. Pretty impressive.

Over in the OpenStack Image Service project, code-named Glance, we focused on performance and stability in this cycle. With a fresh infusion of contributors like Reynolds Chin, Eoghan Glynn and Stuart McLaren, the Glance project made some dramatic improvements. Notably, Reynolds Chin added a visual progress bar to the Glance CLI tool when uploading images, and Stuart McLaren submitted patches that enabled a significant improvement in throughput by starting the Glance API and registry services on multiple operating system processes. Eoghan Glynn fixed a massive number of bugs and added new functionality revolving around external images and having Glance’s API server automatically copy an image from an external datastore. Brian Waldon, Glance’s new PTL (congrats, Sir Glancelot!), added RBAC support and did the heavy lifting of converting Glance’s image identifiers to a UUID format. Check here for the complete list of 11 blueprints implemented and 185 bugs fixed in Glance in the Essex cycle.

The Keystone codebase was entirely rewritten, causing some late cycle turmoil, but the team of contributors working on Keystone is dedicated to improving its stability and functionality in the Folsom release series. The new Keystone design should enable better extensibility and I’m confident the new PTL, Joe Heck, will work actively with contributing organizations to see Keystone make terrific improvements in coming months.

I’m sure there’s lots of names and stuff I’ve neglected to mention and I’ll apologize for that now! :) Here’s to a great design summit a week from now and a productive and cooperative Folsom release series. Thank you to all the OpenStack contributors. You are what makes OpenStack so special.

[1] In the OpenStack community — as in the Ubuntu community — we publish major releases every six months. We don’t hold up releases for a specific feature; if the feature isn’t completed, it simply goes into the next release when it is code-complete. In my opinion, this is one of the strengths of the OpenStack release model: it is predictable.

[2] What’s more, we can’t wait to introduce the goodness of the Horizon Essex dashboard into TryStack. We aim to get this done before the summit, but more on that in a later blog post.

by jaypipes at April 06, 2012 05:11 PM

Michael Still

Reflecting on Essex

This post is kind of long, and a little self indulgent. However, I really wanted to spend some time thinking about what I did for the Essex release cycle, and what I want to do for the Folsom release. I spent Essex mostly hacking on things in isolation, except for when Padraig Brady and I were hacking in a similar space. I'd like to collaborate more for Folsom, and I'm hoping talking about what I'm interested in doing in public might help with that.

I came relatively late to the Essex development cycle, having never even heard of OpenStack before joining Canonical. We can talk about how I'd worked in the cloud space for six years and yet wasn't aware of the open source implementations at some other time.

My initial introduction to OpenStack was being paged for compute nodes which were continually running out of disk. I googled around a bit and discovered that cached images for instances were never cleaned up. (To start an instance, an image is fetched from glance, possibly has its format converted, and is resized; an instance is then started with the resulting image. None of those intermediate images were ever being cleaned up.) I filed bug 904532 as my absolute first interaction with the OpenStack community. Scott Moser kindly pointed me at the blueprint for how to actually fix the problem.
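
To make the cleanup problem concrete, here is a minimal Python sketch of the core decision: which cached images can go? It is not the code that shipped in nova. The function name, the shape of the `cache` dict, and the one-hour default are all assumptions made for illustration.

```python
def stale_cached_images(cache, in_use, now, min_age_seconds=3600):
    """Return cached image names that no instance uses and that are old enough.

    cache:  dict mapping a cached image name to the time it was last used
    in_use: set of image names backing instances on this compute node
    now:    current time, in the same units as the values in cache
    """
    return sorted(
        name for name, last_used in cache.items()
        if name not in in_use and now - last_used >= min_age_seconds
    )


# Hypothetical cache contents; the names and timestamps are made up.
cache = {"ubuntu-base": 1000, "fedora-base": 5000, "old-test-image": 1200}
print(stale_cached_images(cache, in_use={"ubuntu-base"}, now=6000))
# ['old-test-image']
```

Requiring a minimum age before removal avoids deleting an image that was fetched moments ago for an instance that has not finished starting.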

(Remind me, if Phil Day comes to the OpenStack developer summit, that I should sit down with him at some point and see how close what was actually implemented got to what he wrote in that blueprint. I suspect we've still got a fair way to go, but I'll talk more about that later in this post.)

This was a pivotal moment. I'd just spent the last six years writing python code to manage largish cloud clusters, and here was a bug which was hurting me in a python package intended to manage clusters very similar to those I had been running. I should just fix the bug, right?

It turns out that the OpenStack core developers are super easy to work with. I'd say that the code review process certainly feels like it was modelled on Google's, but in general the code reviewers are nicer with their comments than what I'm used to. This makes it much easier to motivate yourself to go and spend some more time hacking than a deeply negative review would. I think Vish is especially worthy of a shout out as being an amazing person to work with. He's helpful, patient, and very smart.

In the end I wrote the image cache manager which ships in Essex. It's not perfect, but it's a lot better than what came before, and it's a good basis to build on. There is some remaining tech debt for image cache management which I intend to work on for Folsom. First off, the image cache only works for libvirt instances at the moment. I'd like to pull all the other hypervisors into line as best as possible. There are hooks in the virtualization driver for this, but no one has started this work as far as I am aware. To be completely honest I'd like to see the image cache manager become common code and have all the hypervisors deal with this in exactly the same manner -- that makes it easier to document, and means that on-call operations people don't need to determine what hypervisor a compute node is running before starting to debug. This is something I very much want to sit down with other nova developers and talk about at the summit.

The next step for image cache management is tracked in a very bare bones blueprint. The original blueprint envisaged that it would be desirable to pre-cache some images on all nodes. For example, a cloud host might want to offer slightly faster startup times for some images by ensuring they are pre-cached. I've been thinking about this a lot, and I can see other use cases here as well. For example, if you have mission critical instances and you wanted to tolerate a glance failure, then perhaps you want to pre-cache a class of images that serve those mission critical instances. The intention is to provide an interface and default implementation for the pre-caching logic, and then let users go wild working out their own requirements.

The hardest bit of the pre-caching will be reducing the interactions with glance, I suspect. The current feeling is that calling glance from a periodic task is a bit scary, and has been actively avoided for Essex. This is especially true if Keystone is enabled, as the periodic task won't have an admin context unless we pull that from the config file. However, if you're trying to determine what images are mission critical, then you really need to talk to glance. I guess another option would be to have a table of such things in nova's database, but that feels wrong to me. We're going to have to talk about this bit more.

(It would be interesting to talk about the relative priority of instances as well. If a cluster is experiencing outages, then perhaps some customers would pay more to have their instances be the last killed off or something. Or perhaps I have instances which are less critical than others, so I want the cluster to degrade in an understood manner.)

That leads logically onto a scheduler change I would like to see. If I have a set of compute nodes I know already have the image for a given instance, shouldn't I prefer to start instances on those nodes instead of fetching the image to yet more compute nodes? In fact, if I already have a correctly resized COW base image for an instance on a given node, then it would make sense to run a new instance on that node as well. We need to be careful here, because you wouldn't want to run all of a given class of instance on a small set of compute nodes, but if the image was something like a default Ubuntu image, then it would make sense. I'd be interested in hearing what other people think of doing something like this.
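
One way to picture that preference is the small Python sketch below. It is not how nova's scheduler is written. The host dictionaries, the `rank_hosts` name, and the `max_same_image` cap are hypothetical, chosen only to show "prefer hosts with the image cached, but don't pile everything onto them".

```python
def rank_hosts(hosts, image_id, max_same_image=5):
    """Order candidate hosts, best first, for an instance using image_id.

    Each host is a dict with (assumed) keys:
      name, cached_images (set), instances_by_image (dict), free_ram_mb (int)
    """
    def score(host):
        cached = image_id in host["cached_images"]
        crowded = host["instances_by_image"].get(image_id, 0) >= max_same_image
        # Tuples sort ascending: hosts that have the image cached and are not
        # already crowded with it come first; free RAM breaks ties.
        return (not (cached and not crowded), -host["free_ram_mb"])

    return sorted(hosts, key=score)


hosts = [
    {"name": "a", "cached_images": set(), "instances_by_image": {}, "free_ram_mb": 8000},
    {"name": "b", "cached_images": {"ubuntu"}, "instances_by_image": {"ubuntu": 1}, "free_ram_mb": 2000},
    {"name": "c", "cached_images": {"ubuntu"}, "instances_by_image": {"ubuntu": 5}, "free_ram_mb": 9000},
]
print([h["name"] for h in rank_hosts(hosts, "ubuntu")])
# ['b', 'c', 'a']
```

Host "c" has the image cached but already runs five instances of it, so it loses its preference and is ranked on free RAM alone.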

Another thing I've tried to focus on for Essex is making OpenStack easier for operators to run. That started off relatively simply, by adding an option for log messages to specify what instance a message relates to. This means that when a user queries the state of their instance, the admin can now just grep for the instance UUID, and run from there. It's not perfect yet, in that not all messages use this functionality, but that's some tech debt that I will take on in Folsom. If you're a nova developer, then please pass instance= in your log messages where relevant!

This logging functionality isn't perfect, because if you only have the instance UUID in the method you're writing, it won't work. It expects full instance dicts because of the way the formatting code works. This is kind of ironic in that the default logging format only includes the UUID. In Folsom I'll also extend this code so that the right thing happens with UUIDs as well.
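
The idea of accepting either a full instance dict or a bare UUID can be sketched with the standard library's `logging.LoggerAdapter`. This is an illustration only, not nova's logging code. The `InstanceLogAdapter` class and the `[instance: ...]` prefix format are assumptions.

```python
import logging


class InstanceLogAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prefixes messages with an instance UUID."""

    def process(self, msg, kwargs):
        # Accept instance= as either a dict with a "uuid" key or a UUID string.
        instance = kwargs.pop("instance", None)
        if isinstance(instance, dict):
            uuid = instance.get("uuid")
        else:
            uuid = instance
        if uuid:
            msg = "[instance: %s] %s" % (uuid, msg)
        return msg, kwargs


logging.basicConfig(format="%(levelname)s %(message)s", level=logging.INFO)
LOG = InstanceLogAdapter(logging.getLogger("sketch"), {})

LOG.info("Spawning", instance={"uuid": "1f3c", "host": "compute-1"})
LOG.info("Spawning", instance="1f3c")
```

Both calls produce the same prefixed message, so an admin can grep for the UUID regardless of what the calling code had to hand.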

Another simple logging tweak I wrote is that tracebacks now have the time and instance included in them. This makes it much easier for admins to determine the context of a traceback in their logs. It should be noted that both of these changes were relatively trivial, but trivial things can often make it much easier for others.

There are two sessions at the Folsom dev summit talking about how to make OpenStack easier for operators to run. One was from me, and the other is from Duncan McGreggor. Neither has been accepted yet, but if I notice that Duncan's was accepted I'll drop mine. I'm very very interested in what operations staff feel is currently painful, because having something which is easy to scale and manage is vital to adoption. This is also the core of what I did at Google, and I feel I can make a real contribution here.

I know I've come relatively late to the OpenStack party, but there's heaps more to do here and I'm super enthused to be working on code that I can finally show people again.

Tags for this post: openstack canonical essex folsom image_cache_management sre

April 06, 2012 02:19 AM

Ryan Lane

OpenStackManager 1.4 released

The OpenStackManager extension is a web interface for OpenStack, and a manager for a fully integrated test and development network being written primarily for Wikimedia Foundation use.

This release is mostly aimed at performance and usability. Here’s a list of changes:

  • Added a project filter. Rather than showing all projects, only projects selected in the project filter will show in the management interfaces. This should make the interfaces contain far less text, and should make interfaces load much faster.
  • Refactored the list pages so that styles can be applied to all pages easily. Applied a couple CSS styles globally across all of the pages. For instance, the table text has been changed to be top aligned, to make large tables easier to handle.
  • Merged in Platonides’s change for handling SSH keys uploaded in formats other than OpenSSH format. Keys in non-OpenSSH format will automatically be converted, if possible. If a private key or a badly formatted key is uploaded, it’ll be rejected.
  • Changed the project section collapsing behavior. Rather than the project title collapsing the project’s section, a “Toggle” action will do so. The project name has been changed back to being a link to the project’s page.
  • Projects are now sorted alphabetically everywhere.
  • Various fixes related to the PHP aws-sdk.
  • Moved creation forms to list pages for many management pages, to avoid extra clicks where possible.
  • Various memcache support additions and fixes.
  • Added a fix to allow user creation through MediaWiki interface.

If you’d like to help develop this extension, I’ve created a development environment in a project in Wikimedia Labs. Find me on #wikimedia-labs on Freenode or email me to get a labs account and access to the project.

by Ryan Lane at April 06, 2012 12:36 AM

April 05, 2012

Duncan McGreggor

Recent Stackiness

Meetup

Tomorrow is OpenStack Atlanta's second event (the first being a HackIn).  Ken Pepple is going to be talking about deploying OpenStack, something he should feel very comfortable doing, given his book as well as Internap's latest announcement :-)

There's more info about tomorrow's event at the Meetup page.

OpenStack Design Summit

In a couple weeks, a bunch of us from DreamHost are going to be heading to the OpenStack Summit and Conference in San Francisco. There's a lot of buzz about it inside the company, in the offices of our fellow OpenStack collaborators, and in the wider open source community. With Citrix's recent announcement, Internap's deployment, and Eucalyptus' approval by Amazon, there's plenty of Cloud Drama to go around. Fortunately, the focus of the Summit and Conference is on the important positives: improving an extraordinary piece of software and disseminating expertise. Can't wait!

GitHub Love

Last but not least, the Dev team at DreamHost has been using GitHub in conjunction with Launchpad in a manner similar to how the OpenStack project does it. The increased interest in open source software in our offices is starting to make its way out to our customers, and we've got a new web presence that is the first step in supporting this new direction. We're cooking up a bunch more stuff, so be sure to check in on our repos from time to time :-)


by Duncan McGreggor (noreply@blogger.com) at April 05, 2012 03:14 PM

John Dickinson

Swift State of the Project Spring 2012

The last six months of OpenStack swift development have been the most active six-month period for the project since the code was first put into production. The developer community has grown, the code has improved, and adoption has increased. The past six months have covered the OpenStack “Essex” release cycle. During this time, swift has made five releases: 1.4.4 through 1.4.8.

Where We Are

The easiest way to get an overview of swift’s evolution is to look at the version control logs.

Swift has had 125 non-merge commits:

git shortlog -nes --no-merges 1.4.3..1.4.8 | awk '{SUM+=$1} END {print SUM}'

Greg Holt has been the most prolific committer:

git shortlog -nes --no-merges 1.4.3..1.4.8 | head -1

Swift has had contributions from Rackspace, SDSC, RedHat, Nebula, HP, SwiftStack, Internap, Memset, CERN and others.

The three largest commits in the last six months have been for the formpost middleware, man pages, and the expiring objects feature:

formpost 7fc1721d7d5290a6af278f9b6844cd3b96b7c7c3
    (11 files changed, 3359 insertions(+), 16 deletions(-))
man pages 0b0785e984d9164c1d1cd84f05dd9909bb7d37a8
    (27 files changed, 3148 insertions(+), 0 deletions(-))
expiring objects 872420efdb8e6e945cd2fe06994136b8c2ee153a
    (20 files changed, 2043 insertions(+), 53 deletions(-))

But looking at VCS logs doesn’t tell the whole story. What is in these commits?

Several important new features have been added to swift. Swift now supports expiring objects, HTML form POSTs with temporary signed URLs, and the OpenStack auth 2.0 API in the swift CLI. Other new features include new config options, optional functionality in middleware, and more ops tools.

Expiring objects allow a swift user to set an expiry time or a TTL on an object, after which the object is no longer accessible and will be deleted from the system. This feature enables new use cases for swift. For example, this feature could be used by a document management system with data retention requirements.
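
As a rough illustration, the expiry is requested with a header on the object PUT or POST. The Python sketch below only builds that header. It assumes the header names `X-Delete-After` (a TTL in seconds) and `X-Delete-At` (an absolute Unix timestamp), which is my understanding of the feature; check them against the swift documentation for your release. The helper name is hypothetical.

```python
def expiry_headers(ttl_seconds=None, expire_at=None):
    """Build the request header asking swift to expire an object.

    Pass exactly one of:
      ttl_seconds: lifetime relative to now (assumed header: X-Delete-After)
      expire_at:   absolute Unix timestamp (assumed header: X-Delete-At)
    """
    if (ttl_seconds is None) == (expire_at is None):
        raise ValueError("pass exactly one of ttl_seconds or expire_at")
    if ttl_seconds is not None:
        return {"X-Delete-After": str(int(ttl_seconds))}
    return {"X-Delete-At": str(int(expire_at))}


# Keep a document for 30 days, e.g. to satisfy a retention policy.
print(expiry_headers(ttl_seconds=30 * 24 * 3600))
# {'X-Delete-After': '2592000'}
```

These headers would be merged with the usual auth token header when uploading or updating the object.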

The new formpost and tempurl middleware modules allow a swift user to create a URL with write access and then use that URL as the target of an HTML form POST. This feature is aimed at a control panel use case. Since swift uses an auth method based on information in request headers, browsers typically can’t access swift directly. With these two new middleware modules, someone building a swift control panel can have the browser directly upload content into the swift cluster. Since the requests are going directly to swift and don’t have to be proxied through the control panel web servers for auth, the control panel deployer only has to scale infrastructure based on the control panel usage, not swift usage.
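
For a feel of how a temporary signed URL is produced, here is a minimal Python sketch of tempurl-style signing as I understand it: an HMAC-SHA1 over the method, expiry time, and object path, keyed with a secret stored on the account. The exact string layout and query parameter names should be verified against the tempurl middleware documentation. The path and key below are made up, and formpost uses its own, separate signature that is not shown here.

```python
import hmac
from hashlib import sha1


def temp_url(path, key, expires, method="PUT"):
    """Return path plus a signature and expiry as query parameters.

    path:    object path, e.g. /v1/AUTH_account/container/object
    key:     the account's secret temp URL key (assumed to be set already)
    expires: Unix timestamp after which the URL stops working
    """
    body = "%s\n%s\n%s" % (method, expires, path)
    sig = hmac.new(key.encode("utf-8"), body.encode("utf-8"), sha1).hexdigest()
    return "%s?temp_url_sig=%s&temp_url_expires=%d" % (path, sig, expires)


# Hypothetical values for illustration only.
url = temp_url("/v1/AUTH_demo/uploads/report.pdf", "not-a-real-key", 1335000000)
print(url)
```

Because the signature covers the method and the path, a URL handed to a browser for one upload cannot be reused to read or overwrite a different object.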

In addition to new features, many bugs have been squashed as well. Swift developers have found and fixed memory leaks, improved data corruption detection, improved replication, and improved the way rings are built.

Swift’s documentation has also been greatly improved in the last six months. Thanks to Marcelo Martins, an ops engineer at Rackspace, swift now has a full set of man pages. Additionally, swift’s self-auditing tool (swift-recon) now has full documentation.

Beyond the code, swift’s community has grown quite a bit. In addition to many private deployments, several companies have announced public deployments or their internal usage of swift. SoftLayer, Haylix, and Aptira have all announced public clouds that use swift. The Wikimedia Foundation has announced that all thumbnails on Wikipedia are now served from a swift cluster, and they are migrating all of their media files to a swift back end.

Swift now has fifty-nine contributors listed in the AUTHORS file. Twenty-seven have been added in the last six months. This is incredible growth (nearly half of the listed contributors are new), and many of these new contributors come from companies that had not previously contributed to swift. This growth speaks to the increasing rate of adoption of swift and builds a strong developer base that will ensure swift’s success in the future.

Where We’re Going

However, swift is by no means “finished” or “complete”. There are always bugs to fix and edge cases that can be handled better. There are new features to add and use cases that can and should be supported. Some examples include solving multi-site deployments and keeping very large containers performant. Both of these improvements will allow swift to grow beyond its current use case, but they involve tremendous complexity to implement well. It is unlikely that serious attempts to solve these issues will be made until they become pain points for swift deployers. As one of the swift developers said, “Swift has solved all the easy problems. All we have left are the really hard problems.”

The biggest challenges facing swift are not technical; they are about the developer community. Expect the swift community to continue to grow. More companies are deploying swift. More developers will be contributing to swift. A larger developer community will of course bring new challenges, but much can be learned from other OpenStack projects like nova. Bringing more developers to swift will allow swift to become more robust and more adaptable to a wider variety of use cases.

The next six months for swift should bring more community education and a larger ecosystem. More companies will deploy swift, and their unique experiences will allow swift to become more robust and feature-filled. Swift’s future is bright as both public and private clouds continue to grow.

Storage is important. Everyone has data, and it’s always growing. You should have ownership of everything that touches your data. OpenStack gives you that power.

April 05, 2012 12:00 AM

April 04, 2012

Thierry Carrez

Ask not what OpenStack can do for you…

Over the last few months I’ve seen more and more tweets and news articles using the formulation “OpenStack should”, as in “OpenStack should support Amazon APIs since it’s the de-facto standard”. I think there is a fundamental misconception there and I’d like to address it.

As a quick aside (and contrary to what the twittersphere sometimes reports), it should be noted that OpenStack Nova has always supported the Amazon EC2 API, and that OpenStack Swift grew an Amazon S3 compatibility layer last year. That said, I’ll be the first to admit that one could rightfully claim that the AWS API support in OpenStack is in worse shape than the OpenStack API support. But the reason behind it is not some “OpenStack strategy”; it’s a reflection of the participating companies’ focus.

OpenStack is a true Open Innovation project. It’s a collaboration ground where multiple companies are free to invest development resources to care about the stuff that is important to them. It’s an influence game where you need to donate developers to play: OpenStack is the playing field, not the players that push the ball.

Red Hat cared about QPID support, so they fielded developers to make it happen in OpenStack. EC2 API support was originally in Nova because NASA cared about it. Then, with the increase of Rackspace’s influence on the project, the OpenStack API grew faster. Now, with interest from Canonical (and others), Amazon’s API support is getting better. Ultimately, code talks, and you can make things happen. That’s what makes OpenStack so appealing but also so confusing to the industry.

As “OpenStack”, we need to make sure the playing field is level (and hopefully the Foundation will be set up soon enough to address that) and that the code is modular and welcoming. But it’s up to the participating companies, which throw development resources at the project, to invest in what’s important for them or their customers. And maintain it over the long run.

So whenever you say “OpenStack should”, ask yourself if you shouldn’t really be saying… [Rackspace, Cisco, HP, IBM, Red Hat...] should. Ask not what OpenStack can do for you. Ask what you can do for OpenStack.


by Thierry Carrez at April 04, 2012 02:08 PM