
opalis: guidance on troubleshooting failed workflows

as you move into deeper integration stories with opalis, you're probably going to run into situations where the outcome isn't quite what you dreamed.

now why is this?  usually it's because when you write the process, you don't yet know everything it will need.  for me (anyway), this often shows up as security-related problems.  the most common reason is that you design as your own account but your workflows run as the opalis action service account.

so, if you would, allow me to offer a little guidance on this.

  • repeat to yourself: i am not the opalis action account.  this has a profound effect in separating you from your delusion that the universe does not want you to succeed today.  the point really is to remember that the testing console runs as the user launching the opalis client.  here's a demonstration by pete.
  • review your audit history.  it's object specific, but sometimes it surfaces problems, such as access denied errors when using the AD object.

  • check the system logs.  while the logs can be quite noisy, they can provide a lot of value if the object problems are being logged.  how do you know if they are?  check!  this post has the log locations.  the policymodule*.log files are the ones to read.
  • minimize confusion with documentation.  document the permissions required for the action server with passionate fervor!  (duh)  i can't even begin to count how many times i changed things in the wrong place.
  • measure once, cut twice or measure twice, cut once.  just measure.  go back and assess the various places where a credential conflict might cause problems.  generally, when you define a connection, you also define the credentials used to connect to that environment.  while objects using those connections may execute as the defined account (connections to opsmgr for example), objects with no defined credentials will execute as you (in the testing console) and as the opalis action service account (in running policies).
  • put on your running shoes.  if you’re working with nested workflows, chase your log histories!  review the log histories of all your workflows.  while not as valuable as being able to run them in the testing console, it still pulls back some data that can be useful.
  • execute each workflow individually if required.  sometimes log history isn't enough.  it can be useful to separate the workflows and run them through the testing console.  remember your identity crisis.  you're running these tests as YOU, not the action server.
  • the append line object is your friend.  back when we scripted in that archaic language known as vbscript, there was this concept of echoing.  you chiseled commands like wscript.echo "i am so cool" into a rock wall and inside of your cmd shell, out it would come (and coincidentally, inside the cmd shell is also where you're cool).  append line is akin to those echo statements: use it to record where you are in your workflow.  write to log.  log often.  log is your copilot.
  • be something you’re not.  try running the opalis client as the action account. this presents its own challenges though as you will be required to grant the action account access to your policies.
  • violate all security principles.  where at all possible, just put the action account into the domain administrators and rid yourself of all the pesky permissions issues.  I AM JOKING!  do not ever do this.

for real world context, i wish to say that i ran into every one of these things.  and probably twice.  (except the last one.  thought about it.  a lot.  not dumb enough to do it though).
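
if you'd rather not eyeball the logs by hand, here's a rough python sketch of two of the ideas above: stamping every trace line with the account that wrote it, and sweeping the policymodule*.log files for permission failures.  treat it as a sketch, not the way opalis does anything.  append_line and find_denied are made-up helper names, the log directory is whatever your install uses (pass it in), the "access denied" markers are my guess at what the lines look like, and it assumes the logs are plain utf-8 text.

```python
import getpass
import glob
import os
from datetime import datetime

# assumption: these substrings are what a permissions failure looks like in your logs
DENIED_MARKERS = ("access denied", "access is denied")


def append_line(path, message):
    """Append a timestamped line that records which account wrote it."""
    stamp = datetime.now().isoformat(timespec="seconds")
    line = f"{stamp} [{getpass.getuser()}] {message}\n"
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line)
    return line


def find_denied(log_dir, pattern="policymodule*.log"):
    """Return (file name, line number, text) for each line containing a denied marker."""
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        # assumption: utf-8 text; undecodable bytes are replaced rather than raising
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, text in enumerate(handle, start=1):
                if any(marker in text.lower() for marker in DENIED_MARKERS):
                    hits.append((os.path.basename(path), number, text.strip()))
    return hits
```

run it once as you and once as the action account and compare the results.  the name in the brackets is the whole point: it tells you who you actually were when things went sideways.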
