Tuesday, February 14, 2017

Scandinavian SCOM Solutions with a Global Reach

A few months before the Christmas break, I had the pleasure of being invited over to the excellent SCOM Day event in Sweden to present a session and hang out with some of my friends from the Scandinavian region.


The event was organised by Approved Consulting in Gothenburg and the target audience was a mix of IT administrators, consultants and senior IT managers. This was my first time visiting Sweden and from the venue to the food, the craft beers and, of course, the people, it was a really enjoyable experience.

While I was over there, I had the chance to sit down with Approved CEO Jonas Lenntun and go through some of the solutions they offer to complement System Center and OMS. I was already aware of the free community SCOM Health Check Report they released a couple of years ago (if you haven’t tried this out yet, then download it from here):


Free solutions like this for SCOM are always good and the Health Check Report delivers an excellent overview of the health of your SCOM deployments - showing you information about the top alerts, events, performance counters, discoveries and even state changes along with database space usage and grooming history.

IT Service Analytics from Approved

Another cool solution that Jonas and the guys have been working on is their new IT Service Analytics platform. This plug and play solution enables organisations to analyse their IT services being monitored with SCOM and then forecast potential issues – well before they occur. If you’ve deployed Service Manager (SCSM) or even Microsoft’s new Operations Management Suite (OMS), then the IT Service Analytics platform can pull data from any combination of SCOM, SCSM and OMS to give you an even deeper analysis of your IT estate.

Here’s an overview taken from their blog on how it works:

By optimizing and combining data from System Center Operations Manager, Microsoft OMS and System Center Service Manager into one holistic data model, you are able to put the IT service in focus. This allows you to extract, correlate and predict information about IT Service Management processes for things like event, capacity, availability, incident and change management.

We utilize most of the Microsoft Business Intelligence tools, such as SQL Server, SSIS, SSAS, R and SSRS. This allows our analytical platform to seamlessly blend with your System Center installation and tap software and hardware resources that are readily available.



Taking it for a Test Drive

Earlier this week I had a chance to take the IT Analytics platform for a test drive and my first impression is that it’s an awesome reporting tool to have in your locker to help with troubleshooting and predictive analysis.

From the home screen, you can choose from a wide range of pre-built reports with information about alerts, capacity management, events, configuration changes and IT service overviews to name just a few.


One of the reports I really like is the Services report. Clicking this tile from the main reports window brings me to the Service Overview shown in the following image:


This report gives me a 30-day availability overview of all the IT services that I have modelled and monitored in my SCOM environment along with information about alerts, change tracking, capacity and predictive event risks.

Here’s a description of what the information in each of the report columns means:

  • Goal – Has the SLA goal been met or not? IT Services that have met their SLA will be displayed as green instead of red (in this demo environment, I’ve sorted the column to display all SLAs that haven’t been met).
  • Service – The name of the IT service.
  • Availability – Displays the last 12 months of the IT service availability.
  • Percentage – The SLA percentage that has been reached. The upwards arrow means that the SLA has reached a better result than the previous month.
  • Failures – The number of outages for the service during this period.
  • Downtime – Displays the number of minutes the service has been unavailable for the month.
  • Alerts – The number of alerts that have been generated by the service during this defined report period. The arrow shows whether the count has decreased or increased compared to last month.
  • Events – The number of events that have been generated by the service during this period. The arrow shows whether the count has decreased or increased compared to last month.
  • Change Tracking – The number of changes made to servers or other components of the service.
  • Capacity Risks – Shows if there are risks with capacity, such as a server running out of free memory based on the usage.
  • Event Risks – Shows if there are any predicted events for the service.
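
As a rough illustration of how the Downtime and Percentage columns relate to the Goal column, here is a minimal Python sketch. It assumes availability is simply the share of minutes in the reporting period during which the service was up; the platform's actual calculation may differ, and the function names and the 99.9% goal used below are hypothetical.

```python
def availability_percent(downtime_minutes: float, period_days: int = 30) -> float:
    """Return availability as a percentage of the reporting period.

    Assumption: availability = (period minutes - downtime minutes) / period minutes.
    """
    total_minutes = period_days * 24 * 60
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def goal_met(downtime_minutes: float, sla_goal_percent: float,
             period_days: int = 30) -> bool:
    """Return True when the measured availability reaches the SLA goal."""
    return availability_percent(downtime_minutes, period_days) >= sla_goal_percent


if __name__ == "__main__":
    # Hypothetical service with one hour of downtime against a 99.9% goal.
    print(availability_percent(60))
    print(goal_met(60, 99.9))
```

A 30-day period contains 43,200 minutes, so under this assumption a 99.9% goal leaves room for roughly 43 minutes of downtime in the month.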

Identifying Bottlenecks

When I drill into a particular IT Service from the Service Overview report, I get a more targeted Service Details report with a number of informational tiles and a Top N view of common KPIs like % CPU, % Memory and % Disk Space used.

The Bottlenecks tile sparked my interest here so I clicked this one first…


This brought me deeper to the following view – where I could see that two of my servers in this IT service were displaying potential bottlenecks.


Clicking into the server with two potential bottlenecks identified, I was then presented with a performance chart that showed a very high percentage of bandwidth used on a new network adapter we recently installed into the server to support DPM backups. The performance chart also confirms for me that although my network adapter spiked on and off for the past few days (no doubt when backup jobs are running), the overall average performance of it seems fine and it’s projected to stay around the 10% utilisation mark for the next few months.


The other potential bottleneck that was identified relates to the % Free Disk Space of a logical disk on the Hyper-V server. I can see from the chart that in the past year, the free disk space on this logical disk has fluctuated from approx. 30% free to a minimum value of less than 1%. The chart looks ahead a few months and predicts that the best I can hope for (assuming I leave things as they are) is no more than 7% free disk space.
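
I don't know which forecasting model the platform uses, so the following is only a sketch of the general idea behind a projection like this: fit a straight line to past readings and extend it forward. The least-squares fit, the function names and the sample readings are all assumptions for illustration, not Approved's method or real data from my environment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_percent(xs, ys, future_x):
    """Project the fitted line to future_x, clamped to the 0-100% range."""
    slope, intercept = fit_line(xs, ys)
    return max(0.0, min(100.0, slope * future_x + intercept))


if __name__ == "__main__":
    # Made-up monthly readings of % free disk space (month index, % free).
    months = [0, 1, 2, 3]
    free_pct = [30.0, 25.0, 20.0, 15.0]
    print(forecast_percent(months, free_pct, 5))
```

A straight line is the simplest possible trend model; real usage data like the fluctuating disk chart above would usually call for something that handles seasonality and noise.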


Predictive Alerts

Back at the Service Details report, I can click the Events tile shown in the image below to give me an Events Report with a heads-up on the forecasted events and alerts that are likely to occur in my environment within the next 24 hours.


All Alert and Event reports have built-in filters for every chart to give you a more scoped analysis view of what's going on. From the Event Report shown in the image below, I can see there are some predicted alerts and events that I need to pay attention to.


Drilling further into the predicted alert value for a particular monitored object, I’m presented with an ‘IIS 8 Web Server is unavailable’ alert that’s been predicted and the number of times it has happened over the last month. I can see the time of day the alerts usually show up. In this example, these alerts typically occur around 6am every day.


If I go back to the previous view and click into the Events tile, I can see it’s broken down into three sections.

The first section is a summary where you can see information on the top hosts, data channels, rules, management packs etc. which are generating the most events. In the image below, we can see that the server generating the most events is SEGOTSQL01. The grey bar in the middle displays last month’s value. You can also see that this server alone has generated 88% of all events for the current period.


The middle section of this report displays the time and day of the week that the events are generated.


The final section of this report gives us an insight into both the last 30 days and the last 12 months for how events are being generated.


Custom Reports

It's easy to create your own custom reports and you can export them to Power BI or Microsoft Excel in a matter of minutes. Here's a nice example of one such custom exported report...


Licensing

I mentioned earlier that I love free solutions for SCOM and when I quizzed Jonas on how much this awesome offering costs to license, I was delighted to hear that Approved have decided to release it for free! They do require a one-off nominal setup and training fee but aside from that, there are no other limitations on the platform.

Summary

If you're interested in deploying these free solutions into your SCOM environment, then use the contact info here to get in touch with the team at Approved. For more information on the IT Analytics platform, take a read of some blog posts written by well-known SCOM community blogger Daniel Örneling here and here.



Monday, January 30, 2017

Update Rollup 12 (UR12) Just Released for SCOM 2012 R2

Today, Microsoft released Update Rollup 12 for SCOM 2012 R2. This update contains a decent number of fixes along with some new enhancements for both Windows and cross-platform monitoring scenarios.


A full list of all the fixes and enhancements can be seen here:

https://support.microsoft.com/en-us/help/3209587/system-center-2012-r2-om-ur12


I've yet to deploy this update into my lab but I'm particularly intrigued by this one:

  • Because of incorrect computations of configuration and overrides, some managed entities go into an unmonitored state. This behavior is accompanied by event 1215 errors that are logged in the Operations Manager log.

I've noticed managed entities going into an unmonitored state after applying overrides or changing distributed application configurations a lot over the last couple of years and it'll be interesting to see if this update sorts out the issue.

As should be the case for everyone deploying this update, test it in non-production environments first and be sure to read through Kevin Holman's excellent step-by-step guide to understand the order in which to apply the update and the additional manual steps that are needed:

https://blogs.technet.microsoft.com/kevinholman/2017/01/30/ur12-for-scom-2012-r2-step-by-step/


Friday, January 27, 2017

KB 3216755 contains a fix for Windows Server 2016 monitoring with SCOM

I've just noticed a new Microsoft KB article (KB3216755) that points to an update that contains a fix for some scenarios where monitoring Windows Server 2016 or Windows 10 might fail.


I haven't come across the exact scenario (yet) that this fix applies to, but it's useful to know there's an update that can help if you run into problems.

Also, be sure to take a read of the 'Known Issues' section towards the end of the KB article where it states:

"The Cluster Service may not start automatically on the first reboot after applying the update."

The workaround for this known issue is to either use PowerShell to start the cluster service on the node you've applied the update to or simply to reboot the node once more.

Check out the KB article here for more info:

https://support.microsoft.com/en-us/help/4011347/windows-10-update-kb3216755


Wednesday, December 7, 2016

SCOM - The Topology Widget, Visio and a souped-up HD display!

Recently, I ran into an issue while creating some dashboards in the SCOM console for a customer and I thought it might be worth sharing.

Normally I use the Topology Widget to light up an image file that I initially put together using Visio and the end-result typically turns out something like this…


The difference this time though was that I’ve been using a new Windows 10 laptop that has some pretty awesome specs and a kick-ass HD display. The downside of having a laptop with Windows 10 and these specs is that application scaling becomes a nightmare and there’s a whole merry-go-round of custom tweaks that I needed to make when I started using it so as to deliver an experience where I don’t need a giant magnifying glass to work!

Here's how I have my Windows 10 laptop scaling settings configured (notice the 250% size setting)...


With these scaling settings in place on the new laptop, I went about my business by first creating a new dashboard image in Visio and then saving it as a PNG file before finally importing the file into SCOM.

When I worked my way through configuring the Topology Widget wizard to map my custom IT services (Distributed Applications) onto the image, the dashboard disappointingly turned out like this...


The problem with this dashboard view is that its grainy quality and tiny health state icons make it hard to read and understand. I've created hundreds of these dashboard views in the past and this was the first time that I've encountered a problem like this so it was time to dig a little deeper to find the solution.

The first thing I tried was to copy the problematic PNG file to another SCOM environment and create a new Topology Widget dashboard there. In this separate environment, the grainy image and tiny health state icons were still there so the problem pointed to an issue with the PNG file.

Another test I tried was to import a completely different dashboard PNG file that I knew worked fine in another customer's environment and thankfully this displayed as expected. With this validation, I was confident that I was dealing with an issue either with the original problematic PNG or the Visio image that I created the PNG from.

As I traced back through my steps, I opened the Visio file again that I created this dashboard in and clicked the Save As option from the File menu to save it as a new PNG. When I did this, I was presented with the following PNG Output Options window:


Notice the default Resolution and Size settings Visio 2016 selects for me when I go to save a new PNG file. I figured that due to the 250% display scaling option that my laptop was configured with, these settings were creating the PNG file at too high a resolution for SCOM to work with.

I went back to the original problematic PNG file and checked the Image Properties and I could see that it was configured to use 2044 x 1548 pixels as shown here...


When I checked the other dashboard PNG file that I knew worked (and which I created on my old laptop), I could see that it was configured to use a much lower pixel size.

So, back to the Visio diagram of my new dashboard and this time, when I clicked the Save As option from the File menu, I manually configured the PNG Output settings to use a resolution of Source and a pixel size of 1123 x 794 as shown in this image...


When I imported this new leaner version of the PNG file back into the same Topology Widget, I finally got the results I was looking for where the health state and image quality were far easier on the eye.
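
If you'd rather check a PNG's pixel size before importing it, here's a minimal Python sketch that reads the width and height straight from the PNG header using only the standard library. The function names are my own, and the 1123 x 794 limit is just the size that worked for me above; it's an assumption for illustration, not a documented SCOM limit.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple:
    """Return (width, height) from the first 24 bytes of a PNG file.

    A PNG starts with an 8-byte signature followed by the IHDR chunk:
    4-byte length, the 4-byte type 'IHDR', then width and height as
    big-endian unsigned 32-bit integers.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_oversized(size, limit=(1123, 794)):
    """True if either dimension exceeds the (assumed) known-good size."""
    return size[0] > limit[0] or size[1] > limit[1]


def check_file(path):
    """Read just the header of the file at path and report its size."""
    with open(path, "rb") as handle:
        size = png_dimensions(handle.read(24))
    return size, is_oversized(size)
```

Calling check_file on the problematic PNG from this post would flag it, as 2044 x 1548 exceeds the 1123 x 794 size that displayed correctly.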

Hopefully this easy fix helps someone else out with their future SCOM dashboard creations!


Tuesday, November 29, 2016

The Most Useful SCOM Article on the Web Just Got an Update!

As anyone who's ever worked with SCOM will know, it's a fairly heavy and complex product to get your head around at first and the larger the environment to be monitored, the more administration and troubleshooting tasks you'll need to teach yourself.


Way back when I started working with SCOM, I quickly found myself lost in a myriad of blog posts and TechNet articles searching for help on how to extract information from the SQL databases to help me better understand the problems I was experiencing.

The one thing that kept coming up trumps for me in my searches time and time again was Kevin Holman's 'Useful Operations Manager 2007 SQL Queries' post. This post brought together a virtual treasure chest of SQL queries that the 'non-SQL admin' like me could easily copy and paste into my SQL Management Studio window for instant information or configuration changes in my customers' SCOM environments.

It was probably the first SCOM reference on the web that I saved as a favourite into my web browser and was always a location that I'd tell new SCOM admins to go check out and bookmark.

As the title of Kevin's post suggests, it was originally put together nine years ago as a central repository of SQL queries for SCOM 2007. When System Center 2012 and ultimately 2016 came around, these queries still worked with the newer releases of SCOM but there was often some confusion from people trying to understand if they only worked with SCOM 2007.

So to address this, just recently Kevin took the time to archive his original 2007-named post and create a new one titled simply 'SCOM SQL Queries'.


Not only has he renamed the post but he has also formatted it in a way that all queries are now much easier to read from and copy/paste as required.

Check out the new location for what is most likely the most useful SCOM article on the web here:

https://blogs.technet.microsoft.com/kevinholman/2016/11/11/scom-sql-queries/


Tuesday, November 22, 2016

Experts Live NL 2016

Today I've just finished up presenting my last public conference session of 2016 at the awesome Experts Live conference in the Netherlands.


This is my second year to attend Experts Live NL and it already seems like the conference attendee and speaker count has grown significantly in that short space of time.

My presentation this year was titled 'Hacking OMS with your OpsMgr Skills' and is an extension of the session that I co-presented with my good friend Cameron Fuller at System Center Universe 2016 in August.

The original idea and title for this session was all Cameron's and with his blessing, I put my own spin on the content to ensure that Experts Live attendees were treated to a significantly different version of the one we delivered previously at SCU. Also, with the vast number of changes and feature additions that we've now become accustomed to with OMS, there was much to show on the day.

My session was the first to open after the keynote and it was refreshing to see the room filled with a large number of current OpsMgr users waiting to hear how to advance their skillsets with OMS.


(Photo credit Pedro van Vliet)

When my presentation was done, I took some time to hang out with old friends and to network with the attendees and various booth vendors around the event.


All in all, Experts Live NL was a good closure for me to a hectic few months of traveling and presenting. I'm looking forward to now refocusing my attention back onto my poor neglected blog and bringing some useful posts into the community over the coming months!


Wednesday, October 19, 2016

Important SCOM 2016 and 2012 R2 Updates!

If, like me, you've jumped aboard the SCOM 2016 bandwagon and started deploying the recently released GA version to your production environments, then you'll need to be aware of two very important updates that need to be added ASAP.

The first one is Update Rollup 1 for SCOM 2016:

https://support.microsoft.com/en-us/kb/3190029

Microsoft have recommended that people deploy this update rollup immediately after deploying the initial SCOM 2016 GA build as it contains fixes for a number of issues that were recently highlighted by users of the Technical Preview 5 release.

The next update is better identified as a patch (KB3200006) that Microsoft needed to quickly release in response to a widespread spate of console crashes on both SCOM 2016 and 2012 R2.

People are understandably frustrated at these crashes as you can read from here and here.

You can get access to the new patch that (hopefully) fixes this problem from the following link:

https://support.microsoft.com/en-us/kb/3200006

Hopefully this helps people out and feel free to use the comments section below (or add your thoughts to the TechNet forums mentioned above) if this patch doesn't solve the console crash issue for you.


Thursday, October 6, 2016

Updated: SCOM 2016 & 2012 R2 Prerequisites Script

Last year when I was starting work on my new Getting Started with Operations Manager book, I needed a PowerShell script that would help me deploy the SCOM 2016 and 2012 R2 prerequisites without fail every time.


The script was a derivative of an earlier SCOM 2012 SP1 script that I published a few years back and it worked fine up until the download link for the ReportViewer prerequisite changed to support Windows Server 2016. I had it on my to-do list to update this script to reflect the new download link but before I got around to it, I noticed that my good friend (and the tallest Dutch guy I know) Oskar Landman had taken my original script and added his scripting magic to it!

Oskar's updated script now has interactive prompts to check which version of SCOM you're installing and whether or not you are deploying the Web Console role (which requires the most prerequisites) - awesome!



Taking your inputs from those prompts, it will then go and download the SQLSysClrTypes and ReportViewer prerequisites to a folder of your choice, install them and then deploy all required roles and features based on your input - nice!

You can review Oskar's original blog post about his work on this script here.

The updated script can be downloaded from its original TechNet Gallery location here:


Sunday, September 25, 2016

Looking back on System Center Universe Europe 2016

A few weeks back I had the honour of presenting again at the annual System Center Universe Europe conference - which was held this year in Berlin, Germany.


This was my fourth year presenting at System Center Universe Europe and I can honestly say that the conference just keeps getting better and better each time. This is mainly due to the epic amount of time and effort the team over at itnetX dedicate to organising it.

I went into this conference initially with two sessions to present. One was a solo effort titled 'What's New with OpsMgr 2016' and the other was a joint presentation with the one and only Cameron Fuller titled 'Using your OpsMgr skills to hack OMS'.

A few short hours after landing in Germany, I ended up adding another presentation to my list. This third one was another joint presentation with Cameron Fuller and it came with the awesome title of 'OMS & OpsMgr: Mortal Enemies, Casual Acquaintances, Best Friends, or Inbred Cousins?' - only Cameron could think up a fun and quirky title like this!

Here's the low-down on how each day over there went for me:

Pre-Conference
The day before the conference began, a few of us (myself, Cameron, Robert Hedblom and Janaka Rangama) decided to do some sightseeing and took a cab over to the Berlin Wall Memorial. Although this is one of the top sights to see when you visit Berlin and definitely something on my bucket list to check out, the sombre historical significance of the wall was never far from our minds.


After a few hours soaking up some culture, we walked through the various meandering side-streets of Berlin until we came across the conference centre that would host this year's System Center Universe event. Located right in the middle of Berlin's Alexanderplatz, it was hard to miss due to the large SCU flags that greeted us on arrival at the front of the building.


After a quick recon mission of the rooms we would be presenting in and the overall venue, we all agreed that this was going to be a good week ahead.


When we left the conference centre, we decided (actually Robert decided) to get some food accompanied with some local beverages. Being in Germany, it would've been rude to order anything smaller than this as a beer to wash down the local staple dish of Currywurst...


Another great thing about this year's host city is the fact that everything is so central and after catching up with my geek friends, it was only a short night-time walk back to the hotel - which was always easy to navigate back to due to the prominence of the building on the Berlin skyline!


Day 1
We had a bright and early start on the Wednesday morning as Marcel Zehner kicked off the opening keynote with a run-down of the few days ahead (including the all-important party list!).


When Marcel got the formalities out of the way, it was straight into tech with special guest Ed Wilson (aka The Scripting Guy). Ed delivered an awesome presentation on how to approach traditional IT challenges in a hybrid IT world.



Straight after the keynote, Cameron and I headed over to our room to get ready for our first co-presented session titled 'OMS & OpsMgr: Mortal Enemies, Casual Acquaintances, Best Friends, or Inbred Cousins?'...


It wasn't long before the room filled up and in true Cameron style, he kicked off the presentation with a pre-recorded video of songs and images that represented mortal enemies, casual acquaintances, best friends and inbred cousins!

The interaction from the audience during our session was awesome and we had so many questions in the Q & A section that we ran out of time!


The rest of the day was spent watching and learning from some of the other presentations and later that night, it was time to chill at the speakers and sponsors party - which was hosted at Club Mio and included a top quality open-air barbecue dinner.


Due to some very suspect MVP dance moves, I'll keep the after-hours nightclub photos away from the internet!

Day 2
On the second day, I took in a cool session from the dynamic Stefan duo (Stefan Roth & Stefan Koell) before heading over to Bob Cornelissen and Savision's session titled 'Prepare for Hybrid Monitoring - SCOM 2012, SCOM 2016 and OMS'. Due to an unexpected hospital visit, Bob had initially asked me to be on standby to take over and deliver this presentation as he wasn't sure he'd even make it over to Berlin but, like a true professional, he stood up on stage and rocked it!

Following this, Cameron, Jan Vidar Elven and I headed over to the 'Ask the Experts' booth to host questions from attendees related to System Center 2016.

Later that afternoon, Cameron and I were back on stage again for our 'Using your OpsMgr skills to hack OMS' session. This was another well-attended presentation (especially considering the SquaredUp guys were doing an impromptu Whiskey Tasting in the room next door!).

When the final sessions of the day finished up, the conference venue played host to the SCU Networking Party where the mix of geek-talk, cocktails and music was rampant.



Day 3
With more than a few sore heads on the last day from the parties the previous night, I had my final presentation to deliver at the opening 09:15 time-slot. This session was titled 'What's New with OpsMgr 2016' and in it I covered all of the new features and enhancements that we can look forward to with the latest release of our favourite monitoring platform. Again, this was another session that had great interaction from the attendees.

An interesting method of gauging attendee feedback for each session was the 'Happy or Not' button stand that was positioned outside the door of each breakout session.


The results from this new rating system were uploaded and sent to speakers within a few days of the conference and thankfully the sessions I participated in were well received.

When I finished my last presentation, it was time to finally chill out a bit after prepping and rehearsing for most of the week and just after lunch a few of us decided to do some final sight-seeing before heading home the following day.

Here's me with my Ergo buddy Gareth checking out some of the amazing architecture around Berlin...


We also stumbled across what we initially thought was a normal Microsoft Store (similar to the ones in the US). However, when we went inside, it was a strange bar/café combination so of course, we stopped here for some light refreshments!


When we were finished in the Microsoft bar, we headed on to the famous Brandenburg Gate (one of the top historical tourist attractions in Germany).



Closing Announcements
After the sight-seeing, we headed back to the conference for the closing keynote where some significant announcements were made relating to the future of SCU Europe.

The first announcement was that the conference would be back again in Berlin next year - which is a decision that has gone down very well with speakers and attendees.

The second announcement was that SCU Europe would be re-branded to Experts Live Europe...


I think this is a sensible decision as the conference has morphed into so much more than just System Center. Yes, we will have a new release of System Center 2016 coming shortly but when you consider the amount of content discussed at this year's conference on Azure and OMS technologies, it just makes perfect sense.

All in all, it was a great week and I'm really looking forward to heading back to Berlin next year to what has now turned into Europe's premier community conference for Microsoft IT pros and geeks.