SCOM KB: You get a CyclicDependencyException when querying Management Pack information using the Powershell or SDK interfaces

May 2, 2016 at 8:38 am in Uncategorized by Jan Van Meirvenne

Symptom:

When you attempt to use the get-scommanagementpack cmdlet to retrieve SCOM management pack info (or SDK equivalent), you get the following error:

Exception of type ‘Microsoft.EnterpriseManagement.Common.CyclicDependencyException’ was thrown.

 

Cause:

This error means that there are at least two management packs in the SCOM management group that depend on each other (a cyclic dependency). This situation is not supported (although it does not break the functionality of the management packs themselves) and should be rectified if possible.

Resolution:

You can use the following SQL query on the SCOM operational database to identify cyclic MP pairs:

SELECT mpas.MPName + ' <-> ' + mpbs.MPName AS Cycle
FROM [OperationsManager].[dbo].[ManagementPackReferences] a
INNER JOIN [ManagementPackReferences] b
    ON a.ManagementPackIdReffedBy = b.ManagementPackIdSource
    AND a.ManagementPackIdSource = b.ManagementPackIdReffedBy
INNER JOIN ManagementPack mpas ON a.ManagementPackIdSource = mpas.ManagementPackId
INNER JOIN ManagementPack mpar ON a.ManagementPackIdSource = mpar.ManagementPackId
INNER JOIN ManagementPack mpbs ON b.ManagementPackIdSource = mpbs.ManagementPackId
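If you prefer to run the check from PowerShell instead of SQL Server Management Studio, a minimal sketch using Invoke-Sqlcmd from the SqlServer module could look like this (the server instance name is a placeholder for your environment):

# Assumes the SqlServer module is installed and the account has read access to the operational database
Import-Module SqlServer
$query = @"
SELECT mpas.MPName + ' <-> ' + mpbs.MPName AS Cycle
FROM dbo.ManagementPackReferences a
INNER JOIN ManagementPackReferences b
    ON a.ManagementPackIdReffedBy = b.ManagementPackIdSource
    AND a.ManagementPackIdSource = b.ManagementPackIdReffedBy
INNER JOIN ManagementPack mpas ON a.ManagementPackIdSource = mpas.ManagementPackId
INNER JOIN ManagementPack mpbs ON b.ManagementPackIdSource = mpbs.ManagementPackId
"@
Invoke-Sqlcmd -ServerInstance 'SQLSERVER\INSTANCE' -Database 'OperationsManager' -Query $query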

 

The result will show the cyclic MP pairs.

You can then use this information either to contact the MP vendor and submit a bug, or to remove the cyclic reference from your code if you built the MPs yourself.

After removing the cycle, the exception should not occur again.

 

How stuff is discovered by Operations Manager, and how you can remove it

February 2, 2016 at 9:43 pm in Uncategorized by Jan Van Meirvenne

SCOM is a very extensive platform, able to handle and maintain a vast inventory of monitored objects. This is done using a model-based database in which various relationships and attached workflows (monitors and rules) make the entire monitoring function operate correctly. Objects can be created or deleted in this inventory in two ways:

Discovery-based

This is the most common way to control SCOM's inventory. A discovery is a workflow defined in a management pack whose function is to detect a certain condition on a machine (using the SCOM agent) and, if the condition is true, report the collected data back to the SCOM management server, which then creates an object of a certain class. For day-to-day authoring this is the go-to mechanism, as it is very straightforward and does not require extensive programming knowledge. Discoveries can use registry, WMI, PowerShell and other types of data sources to determine whether a condition is present or not.

Pros:

– Easy to understand

– Open in nature; modifications are possible through overrides

Cons:

– Complex discoveries (dense application topologies) require extensive PowerShell scripting, which might eat away resources
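To see which discoveries exist for a given class, and whether they are enabled, you can query the Operations Manager PowerShell module; a minimal sketch, assuming the SCOM console (and thus the module) is installed and using a placeholder management server name:

# List the discoveries that target the Windows Computer class
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName 'scom-ms01'   # placeholder management server
$class = Get-SCOMClass -Name 'Microsoft.Windows.Computer'
Get-SCOMDiscovery -Target $class | Select-Object DisplayName, Enabled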

Connector-based

This is a more complex approach to controlling the inventory. There are inbound and outbound connectors within Operations Manager. An outbound connector can send alerts to other management platforms (e.g. forward Operations Manager alerts to Service Manager for incident creation). An inbound connector is not used exclusively to import alerts from other systems; it also allows another system to push an entire health model. The way it works is that you import the management pack that comes with the connector, and then set up the connector itself (this is usually done from the system that is connecting to Operations Manager). The program logic in the other system and the connector then use the structure defined in the management pack to create objects, set monitor states, input performance data, and so on. It is a very powerful mechanism, as it allows you to connect another platform to SCOM and have that platform do all the probing and collection while SCOM is only used to store and visualize the data. It prevents wheel-reinventing and allows the use of the best tool for the job. The prime example of this method is the Operations Manager / Virtual Machine Manager integration: in that scenario you use SCVMM to first import some required management packs (the structure) and then set up multiple connectors which push SCVMM's inventory and corresponding health state into SCOM.

Pros:

– Instead of having SCOM collect the data itself in addition to the connected platform, that platform pushes its own inventory and metrics. This saves resource cycles.

Cons:

– The connector developer is responsible for the amount of data and the interval at which it is sent. There is no override mechanism to control this behavior.

– The connector's inner workings are not open in nature. Figuring out why a certain health state is pushed can be difficult.


 

How can I find out if an object was added by a discovery or a connector?

There are some tables in the OperationsManager database that can be used to know exactly which source added which object, and when.

BaseManagedEntity – this table contains the base properties (id, type, path, …) of all objects that exist within the management group
ManagedType – this table contains all object classes
TypedManagedEntity – this table links the objects in the BaseManagedEntity table to the classes in the ManagedType table
DiscoverySource – this table holds, for each TypedManagedEntity, which connectors and/or discoveries have contributed to the existence and status of the TME
DiscoverySourceToTypedManagedEntity – This table links the various sources in the DiscoverySource table (multiple discoveries or connectors can be the source for a single object) to the entities in the TypedManagedEntity table

By joining these tables together in a query, one can get a pretty good overview of the links between the discoveries / connectors and the objects:

select
    bme.BaseManagedEntityId as 'ObjectID',
    bme.FullName as 'ObjectFullName',
    source.BaseManagedEntityId as 'HostObjectId',
    source.FullName as 'HostObjectFullName',
    case ds.DiscoverySourceType when 1 then 'Connector' when 0 then 'Discovery' when 3 then 'Singleton' end as 'ObjectCreationType',
    case ds.DiscoverySourceType when 1 then c.ConnectorId when 0 then d.DiscoveryId end as 'ObjectCreationId',
    case ds.DiscoverySourceType when 1 then bmec.DisplayName when 0 then d.DiscoveryName end as 'ObjectCreationName',
    d.DiscoveryTarget,
    ds.TimeGeneratedOfLastSnapshot as 'LastModified'
from BaseManagedEntity bme
inner join TypedManagedEntity tme on tme.BaseManagedEntityId = bme.BaseManagedEntityId
inner join DiscoverySourceToTypedManagedEntity dstme on dstme.TypedManagedEntityId = tme.TypedManagedEntityId
inner join DiscoverySource ds on ds.DiscoverySourceId = dstme.DiscoverySourceId
left join BaseManagedEntity source on source.BaseManagedEntityId = ds.BoundManagedEntityId
inner join ManagedType mt on tme.ManagedTypeId = mt.ManagedTypeId
left join Discovery d on ds.DiscoveryRuleId = d.DiscoveryId
left join DiscoveryClass dc on d.DiscoveryId = dc.DiscoveryId
left join ManagedType target on target.ManagedTypeId = dc.ManagedTypeId
left join Connector c on ds.ConnectorId = c.ConnectorId
left join BaseManagedEntity bmec on c.BaseManagedEntityId = bmec.BaseManagedEntityId
order by bme.FullName

This shows the following information for each discovery source:

ObjectId – the ID of the monitoring object that was discovered
ObjectFullName – the full name (type + name) of the monitoring object
HostObjectId – the ID of the object that is hosting the monitoring object
HostObjectFullName – the full name (type + name) of the hosting monitoring object
ObjectCreationType – the type of discovery source (Connector = connector-based, Discovery = discovery-based, Singleton = static object like groups or distributed applications, which have no discovery)
ObjectCreationId – the ID of the connector or discovery
ObjectCreationName – the name of the connector or discovery
DiscoveryTarget – in case of a discovery-based entry, the class that the discovery targets
LastModified – the last time a change was propagated from a discovery source. This timestamp only changes when the object changes (creation or attribute update)!

 

How do I prevent a certain object from being discovered?

Well, with regular discoveries it is easy: just disable the discovery for the system you do not want to be probed for the existence of a certain type and you’re done.

Depending on the scope you want to exclude, you need to make sure you are targeting the correct level.


As an example: let's say you have a monitoring type 'FinanceApp' which represents a software installation of a finance platform. The type is associated with a discovery 'FinanceApp Discovery' which is targeted at the 'Windows Computer' class.

If you do not want to add any FinanceApp instances to your management group at all for the moment (you’re preparing overrides on the monitoring part of the management pack for example), you can create an override that sets ‘enabled’ to ‘false’ for the discovery on ‘all objects of class: Windows Computer’. This will disable the discovery globally and will prevent any instances from being detected and created in SCOM.

If you have a development farm where a lot of developers torture the Finance App constantly (developers are cruel beings, we all know it), and you do not want constant alerts and error states polluting your monitoring landscape, you can create a group to which you add all Windows Computer objects that are deemed development systems, and then create an override that sets 'enabled' to 'false' on 'for a group…'. It is very important that you use the Windows Computer type as the membership type, as this is the target type of the discovery. Choosing a lower-level type (e.g. logical disk) will not work. However, choosing a higher-level type does (disabling a discovery with target 'logical disk' on a group consisting of Windows Computer instances will result in all logical disks contained in those Windows Computer objects being affected).

If you have a single development server, you can just use 'For a specific instance of type "Windows Computer"' and then choose an instance from the list.
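The same group-scoped override can also be created from PowerShell; a minimal sketch, assuming the OperationsManager module's Disable-SCOMDiscovery cmdlet and placeholder names for the discovery, the group and an unsealed override management pack:

# Create an override that disables the example discovery for a group of development servers
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName 'scom-ms01'              # placeholder management server
$discovery  = Get-SCOMDiscovery -DisplayName 'FinanceApp Discovery'      # placeholder discovery name
$group      = Get-SCOMGroup -DisplayName 'Development Windows Computers' # placeholder group name
$overrideMp = Get-SCOMManagementPack -DisplayName 'FinanceApp Overrides' # must be an unsealed MP
Disable-SCOMDiscovery -Discovery $discovery -Group $group -ManagementPack $overrideMp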


For a connector, things become harder. Unless the connected system supports modification of the amount of data that is being synced, you will just need to go with the flow on that part.

Well, too late, I got some stuff in my management group I do not want. How do I remove it?

Again, the answer depends on how the data was inserted in the first place.

For items created by a discovery, you can just disable the discovery using the information provided in the previous section. You can then open up the Operations Manager PowerShell interface and run the command Remove-SCOMDisabledClassInstance. This will prompt you with a warning that this is a database-heavy operation (and it actually is, so take care). If you choose to continue, SCOM will effectively scrub all objects and relationships that have disabled discoveries.
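The cleanup step itself is a single, parameterless cmdlet; a minimal sketch:

# Run from the Operations Manager Shell after the relevant discoveries have been disabled.
# This is a database-heavy operation, so run it during a quiet period.
Import-Module OperationsManager
Remove-SCOMDisabledClassInstance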


For items created by a connector, you should use the connecting system to correctly shut down the connectors (e.g. the SCVMM PowerShell command 'Remove-SCOpsMgrConnection' to remove the SCVMM integrations). Should this not be possible for some reason, you can just delete the connector by again using PowerShell:

Get-SCOMConnector -DisplayName '<the name of the connector you want to remove>' | Remove-SCOMConnector

Simulated example with a SCVMM connector:
 


Note: NEVER REMOVE THE ‘MOM Internal Connector’! This connector is used by SCOM itself to create objects in the database. Deleting this connector would mean certain death for your management group.

Ok, I tried this, but still no cigar!

There are some gotchas that exist when dealing with the SCOM inventory:

1) An object is only removed if ALL discovery sources are either disabled or report that the condition that determines whether the object should be created evaluates to 'false'.

Let's say the FinanceApp has 2 discoveries: one that checks for the simple presence of the FinanceApp Windows Service for basic object creation, and another one that uses a PowerShell script to look up the version information and populate the object's version property. Only if both these discoveries are disabled before running the PowerShell cleanup, or both have run and report that there is 'nothing to discover', will the object be removed.

This is only the case for discoveries on the same level. If there are discoveries that target the FinanceApp object itself (eg to discover and monitor sub-components), they will recursively follow the deletion of the upper-level ones.
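To check which discovery sources are still contributing to a particular object, a trimmed-down variation of the query shown earlier can be run; a minimal sketch (the server instance and the name filter are placeholders):

# List the remaining discovery sources for objects whose full name matches a filter
Import-Module SqlServer
$query = @"
SELECT bme.FullName, ds.DiscoverySourceType, ds.DiscoverySourceId, ds.TimeGeneratedOfLastSnapshot
FROM BaseManagedEntity bme
INNER JOIN TypedManagedEntity tme ON tme.BaseManagedEntityId = bme.BaseManagedEntityId
INNER JOIN DiscoverySourceToTypedManagedEntity dstme ON dstme.TypedManagedEntityId = tme.TypedManagedEntityId
INNER JOIN DiscoverySource ds ON ds.DiscoverySourceId = dstme.DiscoverySourceId
WHERE bme.FullName LIKE '%FinanceApp%'
"@
Invoke-Sqlcmd -ServerInstance 'SQLSERVER\INSTANCE' -Database 'OperationsManager' -Query $query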


2) Sometimes SCOM doesn’t clean up after itself

I have seen this happen from time to time in 2 situations:

– The SCOM infrastructure is under pressure or fails at a bad time, causing an internal cleanup of objects to not come through completely

– Somewhat exotic, but when you use the SCOM SDK to update objects that were created initially through a discovery and you don’t create a dedicated connector, the ‘MOM Internal Connector’ is used. Since you can not remove this connector you’re basically screwed as the object now has a discovery source that is linked to a connector that will never be deleted, giving the object eternal life.

Some blogs mention using the ‘IsDeleted = 1’ trick to remove objects directly from the database. This will trigger a database workflow that recursively cleans up the object itself and its related descendants. This procedure is not supported by Microsoft however.

During a support call a Microsoft engineer mentioned that using the SDK for this is actually supported. So I took some code from a Service Manager SDK example which demonstrates how to delete work items, and made it work on SCOM objects (after all, technically they are the same thing).

I put this in a custom PowerShell module which I will extend with more reusable scripts in the future.

This module autoloads the regular OperationsManager module, and will also load the SDK DLL files. For now, you need to have the SCOM console installed on the machine where you are using this.

To remove an object, you can use the module like this:

Import-Module <path to module>\CustomSCOM.psm1
New-SCOMManagementGroupConnection -ComputerName <name of a SCOM management group server>
Get-SCOMClassInstance -Id <Id of the object to remove> | Remove-cSCOMObject

This leverages the SDK to create an incremental discovery data array, to which the object is added as 'needing removal'. This array is then committed to the management group, which marks the object as deleted in the database. Same result as the IsDeleted trick, but way easier to achieve (and to automate if really needed), cooler and more supported!
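For reference, the core of this pattern boils down to only a few SDK calls. This is a simplified sketch of the idea, not the module's exact code; it assumes the SCOM console (and thus the SDK assemblies) is installed, and the server name and object ID are placeholders:

# Simplified sketch of the SDK removal pattern described above
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName 'scom-ms01'                        # placeholder server
$mg       = Get-SCOMManagementGroup
$instance = Get-SCOMClassInstance -Id '00000000-0000-0000-0000-000000000000'       # placeholder object ID
$idd = New-Object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalDiscoveryData
$idd.Remove($instance)   # mark the object as 'needing removal'
$idd.Commit($mg)         # commit the change, which flags the object as deleted in the database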

Please note that this is a very powerful function (think 'the one ring' powerful) that can destroy a management group if used incorrectly. Make sure to read its prompts well to prevent any unintentional deletions. And of course, use at your own risk.

Well, that’s it for this week. Until the next one! Jan out.

PS: Feel free to reach out to me at @JanVanMeirvenne or jan@jvm-net.com in case of feedback on this topic.

Service Manager: finding and fixing views with invalid criteria

January 25, 2016 at 8:24 am in Uncategorized by Jan Van Meirvenne

 

With every Service Manager project, there are some free add-ons I almost always implement together with the platform. Posting a 'best of' overview of these tools is on my to-blog list (which usually gets longer rather than shorter over time, alas).

One of these tools is the Advanced View Editor, a small add-on that allows you to define views in much more detail (column names, more criteria) than the standard view editor. There is a pro version which contains even more features, but usually the free edition is sufficient for the job.

The tool has a useful feature that allows direct XML editing of the criteria instead of using the common GUI.


However, there is one downside to this: should the criteria of a view become invalid (usually because an enumeration that was deleted is still specified in the view), the editing mode will only allow XML-based modifications. This can be confusing if the service administrator is not familiar with XML, especially as it is not clear why the view is deemed invalid. The regular view editor (built into the console) can show the missing enumeration, but doesn't work well with views created by the add-on (I usually recommend using either the AVE editor or the regular one, but not both).

The best way to prevent such an issue is to get your service management processes straight before implementing them in SCSM. Changing work item categories (incident area / resolution / …) wreaks havoc on both reporting and scoping, and thus severely impacts the efficiency of the platform. However, these cases can not always be prevented, as company organisations and politics change over time, bringing new structures into play that were not foreseen during the initial design of the processes.

How do you know if you are impacted by this issue? Well, when you open a view in the AVE and go to the criteria section, you will get a warning message and the GUI-mode editor will not be available.


Remember, however: you will notice the issue faster with the AVE add-on, but a regular view will be impacted just the same, showing a blank field in the criteria GUI where an invalid item is present.

How do you fix it? Well, you'll need to check the XML and identify which enumerations are used in the criteria. An enumeration is an item in an SCSM list (e.g. Incident Area), and is represented internally as a GUID.

 


You then need to use the PowerShell-based SMLets to connect to the SCSM management group and enter the following command: 'Get-SCSMEnumeration -Id ENUMERATIONIDFROMXML'

Once you get an error, or nothing is returned, you have found the culprit. However, take into account that there can be multiple missing enumerations for a single view, so it is best to check each one before considering the job done. Once you have identified all missing enums, you need to delete them from the XML code. Keep in mind that if you have an and/or clause in your criteria, you might need to remove the <and>/<or> expression and replace it with a regular expression.
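A simplified sketch of this check, looping over every GUID found in a view's criteria XML (the full script linked below does this for all views at once); it assumes SMLets is installed and that $criteriaXml holds the criteria XML copied from the view:

# Test every GUID found in the criteria XML against the SCSM enumerations
Import-Module SMLets
$guidPattern = '[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}'
$guids = [regex]::Matches($criteriaXml, $guidPattern) | ForEach-Object { $_.Value } | Select-Object -Unique
foreach ($guid in $guids) {
    # Note: a GUID that does not resolve to an enumeration may also be another element ID, so double-check before editing
    $enum = Get-SCSMEnumeration -Id $guid -ErrorAction SilentlyContinue
    if (-not $enum) { Write-Output "Missing enumeration: $guid" }
}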


‘That’s all nice and sweet, but do you really expect me to go over each view and check if I am impacted? I have over a hundred views!!!’

Well, first of all, your views will keep working, so there is no immediate threat to the end-user experience. This only becomes a problem when you need to make changes to the views. However, when you set the criteria you expected a certain set of work items to be returned. When that criteria becomes invalid, some work items that should show up in a certain view might just not be there, increasing the risk of a blind spot. A blind spot is something you don't want in a service desk tool!

So, to allow you to quickly pinpoint impacted views, I created a script that scans all views that have criteria, loops over all the GUIDs and tests whether they exist as an enum in the Service Manager lists. It then outputs any views that need fixing in the following manner:

Missing guid ‘{a12f6e91-d223-4741-3ca9-41088c314b84}’ in view ‘Microsoft Outlook’ of type ‘Problem’ in folder-path ‘Work Items/Problem Management – SEC/Problem per Service/Communication/’

Ok, nice! Now you not only know which views are impacted, but also their relative location in your console structure! It is still monkey work to fix the views, but for now I won't add an 'autofix' feature to the script, as it is hard to do that intelligently without risking breaking the views altogether.

You can find the script in my github account: https://github.com/JanVanMeirvenne/SCSMRunbooks/blob/master/Runbooks/Get-cSCSMInvalidView.ps1

Note: you need to have SMLets (and thus the SCSM console) installed on the computer from which you want to run the script, plus the necessary credentials to access the SCSM management group.

Feel free to contact me by mail (jan@jvm-net.com) should you have any questions on this matter!

Until next time! Jan out.

OMS overview: chapter 2 – disaster recovery

November 2, 2015 at 7:23 am in #msoms, #sysctr by Jan Van Meirvenne

OMS Blog Series Index

Since there is much to say about each of the Operations Management Suite (OMS) services, I will break this post up into a blog series:

Chapter 1: Introduction to OMS
Chapter 2: Disaster Recovery with OMS (this post)
Chapter 3: Backup with OMS
Chapter 4: Automation with OMS
Chapter 5: Monitoring and Analysis with OMS
Chapter 6: Conclusion and additional resources overview

This series is actually a recap of a live presentation I gave at the last SCUG event, which shared a slot with another session on SCOM and SCCM better-together scenarios presented by SCUG colleagues Tim and Dieter. You can find my slides here; demo content that is shareable (the parts I don't need to pay for) will be made available in the applicable chapters.

Disaster recovery is one of the bigger money sinks of IT. First you need a big secondary datacenter able to keep your business running in case your primary one bites the dust. Not only do you need to throw money at something you might never need in your entire career, you also have to invest in setting up DR plans for every service you plan to protect. This usually requires both additional design and implementation work, and separate tooling that allows you to perform the DR scenario you envisioned. Especially in the world of hybrid cloud, protecting services across platform boundaries might seem like a complex thing to do: you need to integrate multiple platforms into preferably one single DR solution.

Meet Azure Site Recovery

Azure Site Recovery or ASR provides 2 types of DR capabilities:

  • Allowing the replication and failover orchestration of services between 2 physical sites of your own
  • Allowing the replication and failover orchestration of services between a main site that you own and the Azure IaaS platform 

Bear in mind that while the platform is advertised as a DR solution, it is also possible to use it as a migration tool to move workloads to Azure or other on-premise sites.

The big advantage here is that the solution is platform-agnostic, providing scenarios to protect virtually any type of IT infrastructure platform you use. The DR site can be a secondary VMware or Hyper-V (with SCVMM) cloud, or Azure. Vendor lock-in becomes a non-issue this way!

Supported Scenarios

Here is a full overview of the supported flows:

Infrastructure

  • To Azure
  • To an own DR-site

Application

  • SQL Always-On
  • Other application types must be orchestrated by using the recovery plan feature or by doing a side-by-side migration / failover

Architecture

Basically, there are 2 major 'streams' within ASR to facilitate DR operations, but in any case you'll always need a Recovery Vault. The recovery vault is an encrypted container that sits on top of a (selectable) storage account. If the target DR site is Azure, the vault will store the replicated data and use it to deploy Azure IaaS VMs in case of a failover. If the target DR site is another on-premise site, the vault will only store the metadata needed for ASR to protect the main site. The vault is accessed by the on-premise systems using downloadable vault encryption keys. These keys are used during the setup and are accompanied by a passphrase the user must enter and securely store on-premise. Without this passphrase the vault becomes inaccessible should systems be (re-)attached to it, so it is very important to double-, no, triple-backup this key!

All communication between the different sites is also encrypted using SSL.

In the case that Azure is the target DR site, you must specify an Azure size for each VM you want to protect along with an Azure virtual network to connect it to. This allows you to control the cost impact.

The Microsoft Azure Recovery Services Agent (MARS) and the Microsoft Azure Site Recovery Provider (MASR)

This setup is applicable to any scenario where Hyper-V (with or without SCVMM) is the source site in the DR plan.

The MARS agent needs to be installed on every Hyper-V server which will take part in the DR-scenario (both source and target). This agent will facilitate the replication of the actual VM data from the source Hyper-V servers to the target site (Azure or other Hyper-V server).

The MASR provider needs to be installed on the SCVMM server(s), or, in case of a Hyper-V site to Azure scenario, co-located on the source Hyper-V server together with the MARS agent. It is responsible for orchestrating the replication and failover execution, and primarily syncs metadata to ASR (the actual data replication is done by the MARS agent).


The Process, Master Target and Configuration Server

This setup is used for any scenario with a VMWare (with some additional components described later on), Cloud (Azure or other) or Physical site as a source. These are the components which facilitate the Replication and DR Orchestration. Note: they are all IaaS level components.

Process Server

This component is placed in the source site and is responsible for pushing the mobility service to the protected servers, and to collect replication-data from the same servers. The process server will store, compress, encrypt and forward the data to the Master Target Server running in Azure.

Mobility Service

This is a helper-agent installed on all systems (Windows or Linux) to be protected. It leverages VSS (on Windows) to capture application-consistent snapshots and upload them to the process server. The initial sync is snapshot-based, but the subsequent replication is done by storing writes in-memory and mirroring them to the process server.

Master Target Server

The Master Target Server is an Azure-based system that receives replication data from the source site's process server and stores it in Azure Blob Storage. As a failover will incur heavy resource demands on this system (rollout of the replicas into Azure IaaS VMs), it is important to choose the correct storage sizing (standard or premium) to ensure a service can fail over within the established RTO.

Configuration Server

This is another Azure-based component that integrates with the other components (Master Target, Mobility Service, Process Server) to both setup and coordinate failover operations.

Failback to VMware (or even failover to a DR VMware site instead) is possible with this topology, with some additional components. It is nice to see that Microsoft is really upping the ante in providing a truly heterogeneous DR solution in the cloud!


Orchestrating workload failover/migration using the Recovery Plan feature

Of course, while you can protect your entire on-prem environment in one go, this is not an application-aware setup. If you want to make sure your services are failed over with respect to their topology (backend -> middleware -> application layer -> front-end), you need to use the recovery plan feature of ASR.

Recovery plans allow you to define an ordered chain of VMs, along with actions to be taken at source site shutdown (pre and post) and target site startup (pre and post). Such an action can be the execution of an automation runbook hosted by Azure Automation, or a manual action to be performed by an operator (the failover will actually halt until the action is marked as completed).


Source: https://azure.microsoft.com/en-us/documentation/articles/site-recovery-runbook-automation/

Failing Over

When in the end you want to perform an actual failover, you can choose between 3 types of actions:

– Test Failover: this keeps the source service/system online while booting the replica so you can validate it. Keep in mind that you should take possible resource conflicts (DNS, Network, connected systems) into account.

– Planned Failover: this makes sure that the replica is fully in sync with the source service/system before shutting it down, and then boots the replica. This ensures no data loss occurs. This action can be used when migrating workloads or protecting against a foreseen disaster (storm, flood, …). The protected service will be offline during the failover.

– Unplanned Failover: this type only brings the replica online from the last sync. Data loss will be present as a gap between the failure moment and the last sync. This is only for instances where the disaster has already occurred and you need to bring the service online at the DR site ASAP.

A failover can be executed on the per-VM level or via a recovery plan.

Caveats and gotchas

Although the ASR service is production-ready and covers a lot of ground in terms of features, there are some limitations to take into account. Here are some of the bigger ones:

When using Azure as a DR site

– Azure IaaS uses the VHD format as its storage disk format, limiting the protectable size of a VHD or VHDX (conversion is done automatically) to 1024 GB. Larger sizes are not supported.
– The amount of per-VM resources (CPU cores, RAM, disks) is limited by the resources provided by the largest Azure IaaS size (e.g. if you have 64 attached disks on-premise, you might not be able to protect the VM if Azure's maximum is 32).

Overall Restrictions

– Attached Storage setups like Fiber Channel, Pass-through Disks or iSCSI are not supported
– Gen2 Linux VMs are not yet supported

This looks nice! But how much does it cost?

The nice thing about using Azure as a DR site is that you only pay a basic fee for the service, including storage and WAN traffic, and only pay the full price for IaaS compute resources when an actual failover occurs. This embodies the 'pay for what you use' concept that is one of the big benefits of public cloud. Even better: you only start paying the basic fee after 31 days. So if you use ASR as a migration tool (moving workloads to the cloud or another site), you have a pretty cost-effective solution! Bear in mind that used storage and WAN traffic are always billed.

I won’t bother to list the pricing here as it is as volatile in nature as the service itself. You can use the Azure pricing calculator to figure out the costs.


If you have one or more System Center licenses, check out the OMS suite pricing calculator instead to assess whether you can benefit from the bundle pricing.

Ok, I’ll bite, but how do I get started?

The service is for now only accessible from the 'old' Azure portal at https://manage.windowsazure.com

Log in with an account that is associated with an Azure subscription, and click the ‘new’-button in the bottom-left corner.


Choose ‘Data Services’ -> ‘Recovery Services’ -> ‘Site Recovery Vault’


Click ‘Quick Create’ and then enter a unique name and choose the applicable region where you want to host the service. Then, click ‘Create Vault’.


This will create the vault from where you can start the DR setup


When the creation is done, go to ‘Recovery Services’ in the left-side Azure Service bar and then click on the vault you created.


The first thing you must do is to pick the appropriate scenario you want to execute


This will actually provide you with a tutorial to set up the chosen scenario!


To re-visit or change this tutorial during operational mode, just click the ‘cloud icon’ in the ASR interface


I won't cover the further steps, as the tutorials provided by Azure are exhaustive enough. I might add specific tutorials later on in a dedicated post, in case I encounter some advanced subjects.

 

Final Thoughts on ASR

While I am surely not a data protection guy, setting this puppy up was a breeze! This service, which is now part of OMS, embodies the core advantages of cloud: immediate value, low complexity and cross-platform support. I have already seen several implementations, confirming that this solution is here to stay and will likely be a go-to option for companies looking for a cost-effective DR platform.

Thanks for the long read! And see you next time when we will touch ASR’s sister service Azure Backup! Jan out.

OMS overview: chapter 1– introduction

November 1, 2015 at 10:49 pm in Uncategorized by Jan Van Meirvenne

Hi all!

Before I dabble you under in the realm of Microsoft cloud platform management, I first want to bother you with some personal announcements!

It has been a long time since I posted consistently, but I have some perfect excuses for this:

First off, I got married all the way back in April to my now wonderful wife Julie!


And if that was not enough of a life achievement, meet my son Julian, born in August (yes, Julie is the mother)! This is one of the rare pictures where he smiles, because he generally likes to keep things very serious.


And to complete this combo-high score, we will soon be on the lookout for our own home where we can develop ourselves as a family and live happily ever after!

Despite all of this I am still dedicated to acquiring, producing and sharing knowledge regarding the Microsoft cloud technologies, and while new balances will of course need to be found, I pledge to continue this passion both on- and offline, whether at customer sites or community events! So let's kick the tires again and start off with an introduction to the newborn cloud management platform, Operations Management Suite!

OMS Blog Series Index

Since there is much to say about each of the Operations Management Suite (OMS) services, I will break this post up into a blog series:

Chapter 1: Introduction to OMS (this post)
Chapter 2: Disaster Recovery with OMS
Chapter 3: Backup with OMS
Chapter 4: Automation with OMS
Chapter 5: Monitoring and Analysis with OMS
Chapter 6: Conclusion and additional resources overview

This series is actually a recap of a live presentation I gave at the last SCUG event, which shared a slot with another session on SCOM and SCCM better-together scenarios presented by SCUG colleagues Tim and Dieter. You can find my slides here; demo content that is shareable (the parts I don't need to pay for) will be made available in the applicable chapters later on.

Chapter 1: Introduction

So what is OMS? Well, it is the management-tool answer to the hybrid cloud scenario


The hybrid cloud scenario entails a synergy between your on-premise platforms, Microsoft cloud technologies like Azure and O365, and any 3rd-party cloud platforms you might consume, like Amazon for example.

This kind of 'cloud of clouds' is emerging everywhere, and many companies are already using cloud-based services today. This scenario provides a highly elastic and flexible way of working: quickly spin up additional business app instances in the cloud on demand, or have a full DR site ready to go at the push of a button, all with just the swipe of a credit card, and these are just two examples! However, a nice quote I read somewhere comes into play: 'the easier things become on the front-end, the harder they become on the back-end'. Essentially, things are easy when they are contained on one platform and in one place. The hybrid cloud smashes this ideal by dictating that services should be deliverable through any platform from anywhere. This raises an important question: how do I distribute my services across all these platforms running in various locations while still keeping my single pane of glass?

If you answer 'System Center' you are right, but only partially, for the following reasons:

– The System Center tools were born for on-prem management and although they can interface with cloud and cross-platform technology, their core platform is and will for now remain Windows Server, an on-prem platform.

– One of the goals of the hybrid cloud is to achieve hyper-scale: being able to spin up service instances in a matter of seconds. The amount of data that needs to be managed might be overwhelming for the current System Center tools. Have you checked your SCOM DW size and performance recently? Or done performance tests on your Orchestrator runbooks under high-volume demand? Or tried managing both Azure and Hyper-V VMs as a single unit? Don't get me wrong, they can cope, but as these platforms were designed for on-prem scenarios, they can not always provide the hyper-elasticity and agility that their targets demand nowadays.

'So what are you trying to say? That System Center is becoming an obsolete relic?' Hell no! But just as the managed platforms evolve, so must the management ones! This is why OMS has been developed: to close the gap, and not to replace but to extend the System Center story into the cloud.

Actually, OMS is nothing new under the sun. Just like its older brother EMS (cloud-based workplace management), it is a competitively priced bundle of Azure-based services which together form a single management platform. When you open the OMS site (www.microsoft.com/oms, one of the easiest URLs ever) for the first time, you are greeted with a set of rather abstract management concepts.


Pretty abstract, I agree, but in fact the technology hiding behind these concepts is very simple and straightforward:


MAPPING

Concept               Technology
Backup & Recovery     Azure Backup / Azure Site Recovery
IT Automation         Azure Automation
Log Analytics         OMS (previously known as Microsoft Operational Insights)
Security & Compliance OMS (previously known as Microsoft Operational Insights)

 

As you can see, both well-known and lesser-known Azure services, which have already existed for quite some time, power the platform. What I like about bundling these into one suite is that the services are placed in a broader, general concept of management and are priced and presented in a much more coherent way! Just like System Center is a suite of separate software platforms, so are their Azure counterparts now!

In the posts to come I will attempt to provide a thorough overview of each of these services and describe some scenarios where they might fit perfectly in your environment.

Thanks for giving this post a read and I hope to catch you later! And should you have questions or remarks, don’t hesitate to provide me feedback!

Jan out.

Citrix NetScaler Management Pack Addendum

August 7, 2015 at 11:54 pm in Uncategorized by Jan Van Meirvenne

I have created an add-on MP which adds some in-depth monitoring for Citrix NetScaler: more detailed Virtual Server, Service and Service Group monitoring. All info here: http://www.jvm-net.com/?p=1446

Service Manager: get workstation from where a request / incident was logged

July 6, 2015 at 7:16 pm in Uncategorized by Jan Van Meirvenne

Recently I got the question whether the machine used to log a request or incident through the SCSM portal could be used in an automated runbook. The catch: neither end-user input nor the SCCM primary user feature was an option.

After some looking around and tinkering I got a decent alternative, which you can find here.

SCOM data warehouse troubles #2: The missing objects

June 15, 2015 at 7:40 pm in Uncategorized by Jan Van Meirvenne

The previous week I noticed that my customer's reports were missing a lot of data for recently added servers and their underlying objects. Turns out they didn't exist in the data warehouse at all, even though they were certainly a couple of days old.

I troubleshot the issue and found that there was a conflict between 2 tables in the data warehouse, effectively blocking the entire syncing process of SCOM!

You can read my adventure including the happy ending here

Service Manager: hiding the default incident offering from the portal

May 14, 2015 at 6:05 pm in Uncategorized by Jan Van Meirvenne

This week I was asked how one can remove the default incident offering. This might be important if a company wants to make sure a certain set of information is entered with each incident.

Although this seemed simple to do, it wasn’t that easy.

You can find the full explanation here: http://jvm-net.azurewebsites.net/?p=1421

‘Web Management service is stopped’

March 16, 2015 at 12:29 pm in Uncategorized by Jan Van Meirvenne

There is a small bug in the IIS 7.5 Management Pack which might cause false alerts of the type ‘Web Management service is stopped’ to show up. I have written a short blog post on how to tackle this bug, including an example: link