VMware vExpert 2017

vExpert is a global recognition provided by VMware for having demonstrated significant contributions to the community and a willingness to share expertise with others. The vExpert group is responsible for much of the virtualization evangelism that is taking place in the world — both publicly in books, blogs, online forums, and VMUGs; and privately inside customers and VMware partners.

I am very honored to be named a VMware vExpert for 2017, my second year in a row. You can check out the full announcement here – http://vexpert.me/YX


Congratulations to my friends Tom Cronin @virtual_tom and Joe DePasquale @DePasqualeJoe for being awarded vExpert with me.

My New Journey – Into the Cloud

Happy New Year everyone! Over the last few years my focus has been, and to a point still will be, engineering solutions for my company's virtual infrastructure, and over the last couple of years I have specialized in the vROps suite of applications. Now that the new year is here, I have been assigned to work on my company's direction and strategy for our cloud initiative. This doesn't mean I am moving away from VMware; quite the opposite, I will be digging deeper into what VMware offers as a cloud platform and provider. I will also still have some involvement in ESXi, vSphere, NSX and vROps. My larger team still covers all of those products; however, my focus will shift to private/public cloud. How cool is that, right!? This is the industry trend, and there is so much opportunity to be involved in engineering some awesome solutions and further grow the cloud footprint (cloudprint??) at work. It's not all going to be me; I'll be part of a team, all of us working to provide real value. I'm very excited and feel privileged to be part of this movement.

So what does this all mean for this blog? Not much, except more cool and informative content as I work my way through this. I'll still focus on providing articles and write-ups on how VMware deeply integrates with our cloud solutions, and on the integration of VMware products. I'm betting you will still see some vROps and capacity management stuff too as I transition. Over the last year the articles I have written here have been based on actual challenges, issues, or decisions I have had to work through. Expect the same theme, just a focus change to cloud. I love sharing info with my VMUG/VMware community and I hope the info I post will continue to interest you and maybe even help. 🙂

See ya in the clouds!
Dan @anothergeek

Considerations for Capacity Management with vROps

Navigating your way around capacity management is not an easy task, especially at a large company where it seems almost impossible to get your arms wrapped around it. HA – I picture a large tree and trying to hug it, not quite able to lock your fingers on the other side! It's really kind of like that. You've got most of it, but you are always reaching. At times you need to step back and re-evaluate your angle or approach. Over the last year or so I've been working with the capacity management team to choose exactly the right metrics to determine the best way to evaluate capacity. Last week one cluster, according to vROps, was in desperate need of capacity; we were running into our buffers. However, when we looked closely in our review meeting, we noticed that the reason we were out of capacity was CPU demand. This spun off a number of weekly meetings to reconsider our approach or angle to see if we can get our fingers locked. In all honesty, this wasn't an oversight; we have a pretty smart group of people and we meet regularly to review. Everyone on our team has the same goal, and these types of discussions make sure we are staying on target. We did realize, though, that we needed a deeper understanding of the different types of capacity models and how to apply them as policies across the virtual infrastructure. So let's start with a quick level set and go from there. All right, here we go!

Allocation Model
This model bases capacity on the configured amount of resources assigned to a VM or the VMs in a cluster. The consensus is that this model should be used for production environments where you have important workloads, you want to keep resources in reserve for failover, and you want to make sure you don't overcommit by too much. You decide your overcommitment ratio and set it in the policy. This is the most conservative capacity model.

Demand Model
The Demand model is often used in test/development environments where you don't necessarily care about overallocation and you want to fit as many guests as possible into the environment. If you are using this model you probably don't mind the hosts running hot. You will likely be way overallocated, but again, that's acceptable because you are aiming for the highest possible VM density.

Memory Consumed Model
This model lets you see the memory resources in use just as you would in the vSphere client. It shows active memory, plus shared memory pages, plus recently touched memory, plus all the memory overhead.
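
To make the difference concrete, here is a minimal sketch in plain Python with completely made-up numbers (the host count, overcommit ratio and demand figures are all hypothetical); it illustrates the idea behind the models, not how vROps calculates capacity internally.

# A hypothetical cluster of 3 hosts with 16 physical cores each.
physical_cores = 3 * 16

# Allocation model: capacity is based on configured resources and your
# chosen overcommit ratio, regardless of what the VMs actually use.
vcpu_overcommit_ratio = 4            # policy setting: 4 vCPUs per physical core
configured_vcpus = 250               # total vCPUs assigned across all VMs
allocation_capacity = physical_cores * vcpu_overcommit_ratio    # 192 vCPUs
allocation_remaining = allocation_capacity - configured_vcpus   # -58: over capacity

# Demand model: capacity is based on what the workloads actually consume.
avg_demand_cores = 30                # observed CPU demand, in core-equivalents
demand_remaining = physical_cores - avg_demand_cores            # 18 cores of headroom

print("Allocation model:", allocation_remaining, "vCPUs remaining")
print("Demand model:", demand_remaining, "cores of headroom")

Same cluster, two very different answers: the allocation model says we are 58 vCPUs over capacity, while the demand model still shows headroom. That is exactly the kind of disagreement that kicked off our weekly meetings.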

So which one do we choose? That's an excellent question. In all likelihood, we are going to look at all of these models and how they affect capacity. We have, and I'm guessing you do too, clusters with mixed workloads, or, due to licensing considerations, clusters where you have to mix test/dev hosts with production hosts. So it's not so easy to just pick one model and go with it, especially when you have to scale up the environment to meet the needs of the company. Our team decided to start implementing different policies specific to each cluster and the workloads in it. The policies will include different allocation overcommit ratios for CPU, memory and disk. Some policies will account for all three models, others just one or a combination. What's really great is that vRealize Operations is so flexible it's really easy to dial in capacity just the way you want it.

One other decision we made that you might want to consider: we will rely only on the data in vROps for capacity management. We won't look at what vCenter shows for cluster resources used to determine if we can "fit" more VMs in. Capacity management is not easy; it takes time to collect metric data, analyze it and then tweak it so you are sure you can make the best decisions. Sometimes those decisions can save (or cost) your company a significant amount of money. The good news is there is no magic going on here. If you put in the work and use a great tool like vRealize Operations Manager, you will get to a point where it returns real value. Now that our team has decided to use a combination of models, we can begin to adjust policies and review the data that's already been collected to make sure we are using metrics that meet our needs. I'd love to hear how others are using vROps to determine capacity and some of the challenges and successes you have encountered. If you read this and want to share, add a comment.

I'd like to thank Hicham Mourad for his help with some questions and his guidance along the way. He is a really smart guy, and I'm thankful I can reach out to him when I need to. 🙂

VMware Announces General Availability – vSphere 6.5

Today VMware announced general availability for vSphere 6.5. I'm really excited about this release, not only for vSphere 6.5 itself, but I am also really looking forward to testing out VSAN 6.5 and vROps 6.4, along with Log Insight 4. All these products will go into my home lab first for me to play around with and try out new features, then I'll start to upgrade our engineering test lab at work, putting everything through its paces, then on to test/dev/cert and into prod. I'm really interested in the new fully supported HTML5 web client and the predictive DRS features. I also want to check out the new appliance management and update manager.

Happy testing! 🙂

Here is the link to the Official GA Announcement

Manually Increasing vSphere Web Client Heap Size

The other day, when I was building up a vSphere 6.0 environment in my lab for testing, I ran into an issue where performance in the web client was extremely slow and I was continually receiving an error that the vmware-dataservice-sca and vsphere-client status would change from green to yellow. When I deployed the VCSA/PSC appliance I chose "Tiny" as the size option. Even though my implementation is going to stay under 10 hosts and 100 VMs, I think this build was just not enough, and performance in the web client was really lacking. Searching the VMware KB, I came across KB 2144950 and found out this is a known issue affecting vCenter Server 6.0. Here are the steps I used to work around the error and get performance back in the web client.

First I added additional RAM to the appliance. Pretty straightforward, no magic there. Then I used SSH to connect to the appliance and ran the following command:

cloudvm-ram-size -C XXX vsphere-client

Replace XXX with the new heap size in MB.
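
For example, to give the web client a 1GB heap (1024 here is just an illustrative value; size it to fit your environment):

cloudvm-ram-size -C 1024 vsphere-client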

If you are running a Windows vCenter Server, find C:\Program Files\VMware\vCenter Server\visl-integration\usr\sbin\cloudvm-ram-size.bat and run this command:

cloudvm-ram-size.bat -C XXX vspherewebclientsvc

Again, replace XXX with the new heap size in MB. Don't forget to restart the vSphere Web Client service afterwards.
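
On the appliance you can bounce the service from the same SSH session; on VCSA 6.0 the service-control utility should do it:

service-control --stop vsphere-client
service-control --start vsphere-client

On Windows, restart the vSphere Web Client service from the Services console instead.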

Removing a PSC or vCenter Server in vSphere 6.x

The other day I was bringing up another vSphere 6.0 environment for our VDI team in our engineering test lab, and for some reason I was having all sorts of issues. I was installing a VCSA with an embedded PSC and connecting it to an existing SSO domain. I had no idea what was going on; it was going horribly. One time the install would fail, then the next it would complete, but enhanced linked mode was just acting weird… Well, unbeknownst to me, the QIP team had decided to cut DNS over to new appliances, and that was wreaking havoc across the environment. So now that I've killed (I kid) the guy who was doing this, I'm left with a mess to clean up. Finally DNS is working properly, so I'm going to re-deploy the PSC/VCSA again, but before I do that, I have to clean up the one I don't want anymore. Lucky for us, it's a pretty easy job.

The first step was to make sure my appliance was powered down. I knew that no other VCSA was pointing to this PSC. If you are unsure whether any other vCenter is connected to the PSC you are removing, you can check by logging into the vSphere Web Client, going to the advanced vCenter Server settings, and looking for a property called config.vpxd.sso.admin.url; the value of this setting is the PSC the vCenter Server is using. If you find any other vCenters, VMware has KB 2113917 to help you repoint your vCenter to a different PSC.

Once that is all sorted out, the next step is to connect via SSH to another PSC in the same SSO domain and run the following command:

cmsso-util unregister --node-pnid Platform_Services_Controller_FQDN --username administrator@your_domain_name --passwd vCenter_Single_Sign_On_password
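
As an illustration, with hypothetical placeholder values (psc01.lab.local and vsphere.local stand in for your own PSC FQDN and SSO domain):

cmsso-util unregister --node-pnid psc01.lab.local --username administrator@vsphere.local --passwd 'VMware1!'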

After that completes, delete the appliance from your inventory and check in Administration -> System Configuration -> Nodes to make sure it's not listed there.

Removing a VCSA is just about the same as above; you just have to make one change in the command:

cmsso-util unregister --node-pnid vCenterServer_System_Name --username administrator@your_domain_name --passwd vCenter_Single_Sign_On_password

If you need some additional info on these steps, check out KB 2106736.

Project Home Lab – Part 1 – Hardware

A home lab can be a great resource for any app dev, sys admin or engineer. It's a great tool for learning about the products you are responsible for, and I believe the value it returns to the company you work for is tenfold. Think about it: you are learning at home on your own time, then bringing that knowledge back to your job to apply it to development projects or support. It's really a win-win. My engineering team and I proposed to our department management a project to provide home labs to our engineering and app development teams. We thought it would be a great way to bridge the communication gap between the two teams and help reduce or eliminate shadow IT. One of the challenges we have come across working in a large company is knowing exactly what our development teams need to perform their jobs. Is it containers, OpenStack, or just some other product that allows them to move their projects and initiatives forward? The answer is probably yes to all or any of those questions, and it's more than likely already running under their desks. Our thought was to give the various teams a supportable (internally supported) platform to work creatively and learn, plus a direct line of communication from app dev to engineering without going through the traditional channels. We are hoping this will provide a quicker turnaround time to engineer the infrastructure to meet the needs of the developers and give them the tools they need. At least that's the theory behind this pilot project. Over the next few posts I'll share some of the cool things I'm doing with my home lab, and I'll also let you know any feedback I receive from the teams using it and from management. So let's get to it!

Part 1 – Hardware

When my team first set out to select the right hardware, we looked around the internet. There are many choices and flavors of home lab to pick from. Gone are the days when you needed some big honking old decommissioned servers that suck power and cause your wife to complain about the noise and the cost of electricity. Today's home lab is small, quiet, powerful and efficient, and can provide a number of configuration choices for testing all sorts of builds and designs. Our requirements were pretty simple and standard. We wanted vSphere (no duh!) and some VSAN (yeah baby!) and a whole bunch of extra storage. Here is the rundown of what we decided to get.

  • 3 Intel NUC kits (NUC6i5SYH) – that's a Core i5-6260U 1.8 GHz processor
  • 3 Crucial DDR4 32GB (2×16GB) DIMM kits – each NUC gets 32GB of memory
  • 3 Samsung 850 EVO (MZ-75E2T0B) 2TB 2.5″ SSDs, SATA 6Gb/s – one for each NUC
  • 3 Samsung 850 EVO M.2 (MZ-N5E120BW) SSDs, SATA 6Gb/s – one for each NUC (VSAN cache)
  • 3 StarTech USB 3.0 to Gigabit Ethernet NIC adapters – VSAN traffic will go over this NIC
  • 3 Kingston DataTraveler G4 8GB USB flash drives – we'll install ESXi on these and boot from them
  • 1 Synology DiskStation 5-bay DS1515+ NAS server
  • 5 WD Red Pro NAS hard drives (WD8001FFWX) 8TB SATA 6Gb/s
  • 1 Linksys (SE3016) 16-port unmanaged switch
Full disclaimer here… I did not purchase the hardware with my own money; my company purchased it for the pilot home lab project I mentioned above. So yeah, I know what you are thinking: what a deal. I agree, but I really believe my company will get value back from the purchase, and with some conditions, they seem to believe that too. You should expect to purchase licensing as well. The licenses I'm using are my own: I have VMUG Advantage and MSDN, and I also get some free VMware licenses for being a vExpert. Really look at VMUG Advantage; it's the best option and it's very affordable. It goes without saying: don't use your production licenses. Info on VMUG Advantage can be found HERE. If your budget is tight, no worries; you can easily scale down (or up) to meet your needs, and with all the options out there you should be able to build a really decent home lab.

All the hardware went together really nicely. I have to be honest, putting all that stuff together really gets my geek flag flying. It's almost a religious experience. Takes me back to when I was a kid building PCs at home; but I digress. 🙂 You shouldn't really have any issues connecting all the pieces. One of the guys on my team did have a NUC that only saw 16GB of memory; he just needed to reseat one DIMM and that fixed it.

In Part 2, I'll go over some design considerations and the build-out.