Vol. 21, #34 - August 22, 2016 - Issue #1094
Earlier this year we fielded an Ask Our Readers call for help from a reader named Darby, who was having problems with a terminal (RDS) server that had been periodically refusing new connections from remote clients. Some of our readers tried to help by offering various suggestions, but the problem persisted, and I kept in touch with Darby over the next few months to track developments. As of now the underlying problem remains unresolved, but a workaround is in place that keeps the server fulfilling its business purpose. This raised a question in my mind: when is an IT solution "good enough" from a business perspective? This is the topic we're going to explore in this week's issue of WServerNews, and we hope it will spark your interest enough for you to send us your own comments on this subject by emailing us at firstname.lastname@example.org
Speaking of being good enough, when is a tech startup good enough for a venture capitalist to want to invest in its future development? Check out the Wally criteria in this classic Dilbert comic strip:
Ask Our Readers: WServerNews has almost 100,000 subscribers worldwide. That's a lot of expertise to tap into. Do you need help with some issue or need advice on something IT-related? Got a question you'd like us to toss out to our readers to try and answer? Email us at email@example.com
Last month in Issue #1089 Tech support scams your Editor shared a story of how he was almost (but not really) tricked into a tech support scam he had not previously come across. We received a ton of feedback from our readers on that topic and we published a selection of it in Issue #1092 Reader feedback: Tech support scams for the benefit of our other readers out there. This week one more piece of feedback came in on this topic and it was so good that we just had to publish it. This story was submitted by Ari who runs a company that provides IT support services for businesses right here in Winnipeg, Canada where your Editors live and work:
My son took a call the other day from "Microsoft Support" calling to let us know that we had a problem with our computer. Of course he's well aware of the scam and doesn't fall prey to it. Instead, he offered "Sorry, but we have a Mac." We do (as well) but that's not the point. What happened after that just made me raise my eyebrows and laugh.
The 'tech support representative' (with a thick 'foreign' accent) replied: "Oh. I'll transfer you to our Apple division."
Waitwut? Apple support division at Microsoft? Even my son had to laugh about that. "You're calling from Microsoft, right? When did Microsoft get into Mac support?"
Thought I'd share that with you, if for no other reason than to make you smile.
Sounds like those scammers are pretty desperate, doesn't it?
And now let's move on to the main topic of this week's newsletter. We'll start off by doing a quick recap of Darby's story and then we'll bring everything up to date so we can all learn some lessons from Darby's experiences...
Way back in April of this year in Issue #1076 Hot desking blues we fielded an Ask Our Readers request for help from a reader named Darby who is a Senior Consultant based in the North San Diego County area of California, USA. In his email Darby described a problem he had been experiencing that neither he nor Microsoft Support had been able to successfully troubleshoot:
I am writing this email in hopes that one of your subscribers has run into this problem before as I have been searching every online source I can think of and have been working with MS support for weeks now without a solution. Needless to say, my users are not very appreciative when this server decides to stop accepting new RDP connections.
A brief bit of history...this is a Windows Server 2008 R2 server (completely up to date with all MS Updates) that hosts a couple different back office solutions...one is delivered via RDP and the other is web based. This server runs pretty well 24x7 and worked great from early December, when it was created, until about mid February when the first occurrence of this behavior appeared.
When the issue arises, users attempting to initiate an RDP session are able to enter their login credentials no problem, but the session just hangs on the part where the RDP client says 'Initiating Remote Connection' and eventually just times out saying it cannot connect. I have let this condition run overnight to see if it was temporary, but it did not appear to be. Each time this occurs, I must force a reboot of the OS and when the system reboots everything is back to normal and new RDP connections are created just fine.
What's even more perplexing is that the web based app and some scheduled tasks set up under the Windows Scheduler appear to continue to work fine even during the periods where the server will not create an RDP session from the client.
I don't believe the server itself is hung, as these other tasks work, and work the same as normal in terms of performance. But whatever is preventing the establishment of the new RDP session(s) has so far always required a reboot. We have been unable to find anything conclusive in the Terminal Services Event Logs.
Any advice is greatly appreciated at this point.
Two weeks later in Issue #1078 Disk encryption tools we published several responses from readers who had various suggestions concerning Darby's problem. I touched base with Darby several times in the month that followed and we published his updates on his situation in Issue #1082 Catching Up. In his first update Darby said:
Hi Mitch, I saw this in the issue last week and wanted to thank you for all the great ideas. I also wanted to give you a quick update. In drilling into this one more, it appears as though this may be related to the AV software I am using. I eventually found this post after finally discovering a chronological association between the error 4005 in the Event Viewer and the login issue:
When I read this post, it was as if I had written much of it. Only after reading the associated threads down to the bottom did I discover that there was some kind of issue potentially with the Webroot AV software. I then contacted their support and received a special build of the client/agent to install and test. This was deployed last weekend and I am now waiting for, hopefully, no further occurrences.
Since I was seeing this event about once a week I believe I 'should' have some idea by this time next week whether this was the issue. I will certainly let you know.
Again, I greatly appreciate your help and if at all possible I hope I can help someone else avoid the many hours of frustration and stress I've experienced with this one. Thank you again Mitch.
Then a week later he replied as follows when I pinged him concerning his situation:
Latest update is that I have been escalated to Sr. Support at Webroot and have had them confirm that my anecdotal observation is correct in that there is a correlation between the number of users on the system (user sessions) and the frequency of the occurrence.
Just this morning they sent me an older build to test to see if the issue 'goes away' and then if it does, they want to apply the next build and see if it reappears...you get the idea from there.
While it has been kind of painful in terms of having to check on these servers pretty frequently to try to stay ahead of the problem or at least catch and reboot it before it affects anyone, at least they are working on helping.
The heck of it is, what do you do? You can't just remove AV because that opens the possibility for far worse events...so I'll keep at it and keep my fingers crossed.
I'll keep you updated as this unfolds. Thank you for checking back with me. I just hope someone else can benefit from my experience as I am sure I have benefited from the experiences of others in the past whether knowingly or not.
I've reproduced all this as I now want to bring you up to date concerning Darby's problem and the final workaround he achieved as I feel his story says several important things about today's incredibly complex IT environments and what it's like to work "in the trenches" in such environments.
After that last update from May, Darby reached out to me again in June with the following:
Hi Mitch, I wanted to send along the latest update in the ongoing saga. The AV vendor's support sent me a beta last week that I could not install until this last weekend (no opportunity to perform the requisite reboot associated with the change) that they said had not experienced the login hanging issue we were experiencing and suggested that I try it. By the time I was able to install this beta, the live version was already caught up and so I was essentially no longer using a beta after the auto update. I will, of course, monitor the situation now but interestingly I did get a message back from one of the senior support staff at the vendor after notifying him that I had applied the updated build saying that the current live build has not yet had a reported case of this error and that they are asking customers to ensure that all terminal servers running this product are on this latest build and have all currently available Microsoft updates installed. They also said that since they were never able to reproduce in house and we did not make a code change that was said to resolve this, they are leaning towards a MS update that resolved this but the investigation continues.
What's interesting is that I had a case open with Microsoft on this well before I had found the post online pointing to their software as a possible cause. So who knows... Anyways, I'll send another update in a week or two or if something comes up sooner. Thank you as always for your help and support.
To close the loop I pinged Darby again towards the end of July and he replied:
Hi Mitch, Thanks for checking back in. To be honest, at this point I do not have a definitive resolution. We had tried a couple older builds but because this is a production server with 30+ users at any given time, I just couldn't continue waiting for all the users to lose access without warning.
What I have done is schedule reboots three times a week for this server and that seems to be enough to prevent the issue for the last couple months.
I wish I had a better answer for you, but I was down to the point where I had to either risk losing the client or remove the AV software altogether. At least this is a compromise I can live with.
What can we learn from all this? Three things in my opinion. First, most vendors of enterprise software products will bend over backwards as this one did to try and help their customers resolve any problems that may arise from (or may simply seem to have arisen from) their products. Second, most IT infrastructures consist of many interrelated parts that often have numerous interdependencies that are difficult to understand when it comes to troubleshooting problems. And third, real-world IT is basically all about keeping the network going and the servers running in order to support business applications and services, so a workaround in the hand is probably worth a dozen solutions in the bush.
In fact, I can't count how many times I've heard sysadmins tell me that they need to reboot a server periodically to fix some difficult to troubleshoot issue that's causing problems for the business. There's nothing a priori wrong with scheduling reboots if that's what it takes to keep operations going. So while Darby's current workaround certainly isn't perfect, what is perfect in IT?
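For readers who want to try the same kind of workaround, here is a minimal sketch of how a three-times-a-week reboot could be registered with the Windows Task Scheduler. Darby did not tell us which days, what time, or which tool he used, so the Monday/Wednesday/Friday schedule, the 03:00 start time, the task name and the helper function `build_reboot_task_command` are all our own assumptions for illustration. The script only builds and prints the `schtasks` command line so you can review it before running it yourself from an elevated prompt on the server.

```python
import subprocess


def build_reboot_task_command(days=("MON", "WED", "FRI"),
                              start_time="03:00",
                              task_name="ScheduledServerReboot"):
    """Return the schtasks argument list for a weekly reboot task.

    The days, time and task name are illustrative assumptions, not the
    values Darby actually used.
    """
    valid_days = {"MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"}
    unknown = [day for day in days if day not in valid_days]
    if unknown:
        raise ValueError("Unknown day abbreviation(s): " + ", ".join(unknown))

    return [
        "schtasks", "/Create",
        "/TN", task_name,               # name shown in Task Scheduler
        "/TR", "shutdown /r /f /t 60",  # forced restart after a 60 second delay
        "/SC", "WEEKLY",                # weekly schedule...
        "/D", ",".join(days),           # ...on the listed days
        "/ST", start_time,              # 24-hour HH:mm start time
        "/RU", "SYSTEM",                # run under the local SYSTEM account
        "/F",                           # overwrite a task with the same name
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(subprocess.list2cmdline(build_reboot_task_command()))
```

Pick a start time outside business hours, and keep in mind that the forced restart will drop any sessions still connected at that moment.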
To sum up then, my own view is that good IT is basically any IT that works and that does the job regardless of whether it's elegant or not. It's sort of like finding the area under an algebraic curve by using the Monte Carlo method instead of looking up the formula for the integral. What about you readers out there? Do you agree or not with this perspective? I'd love to hear feedback from those of you out there that are working in the trenches doing real-world IT to support for-profit businesses. Email me at firstname.lastname@example.org
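To make the Monte Carlo comparison concrete, here is a small sketch that estimates the area under a curve by throwing random points at a bounding box and counting how many land below the curve. The curve y = x^2 on the interval from 0 to 1, the sample count, the seed and the function names are our own choices for illustration. The exact answer from the integral formula is 1/3, so you can see how close the inelegant approach gets.

```python
import random


def monte_carlo_area(f, a, b, y_max, samples=200_000, seed=1094):
    """Estimate the area under f between a and b by hit-or-miss sampling.

    Assumes 0 <= f(x) <= y_max everywhere on [a, b].
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(samples):
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, y_max)
        if y <= f(x):          # the random point landed under the curve
            hits += 1
    box_area = (b - a) * y_max
    return box_area * hits / samples


def parabola(x):
    """Example curve: y = x^2."""
    return x * x


if __name__ == "__main__":
    estimate = monte_carlo_area(parabola, 0.0, 1.0, 1.0)
    print(f"Monte Carlo estimate: {estimate:.4f}   exact value: {1 / 3:.4f}")
```

The estimate is never exact, but with enough samples it is close enough for most practical purposes, which is rather the point.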
Got anything more to add on this subject? Email us at email@example.com
IT Security Risk Control Management: An Audit Preparation Plan (Apress)
This book explains how to construct an information security program, from inception to audit, with enduring, practical, hands-on advice and actionable behavior for IT professionals.
Available for pre-order from Amazon:
SQL Server 2016 Essentials for the Oracle Database Administrator
Are you an Oracle Database Administrator (DBA), looking to support SQL Server databases after a migration project? Check out this course, especially if you need to get spun up quickly on SQL Server 2016. Learn the essentials of SQL Server 2016, along with the things you need to be successful in running SQL Server within your organization.
"What we wish, we readily believe, and what we ourselves think, we imagine others think also." -- Julius Caesar
Note to subscribers: If for some reason you don't receive your weekly issue of this newsletter, please notify us at firstname.lastname@example.org and we'll try to troubleshoot things from our end.
Until next week,
GOT ADMIN TOOLS or other software/hardware you'd like to recommend? Email us at email@example.com
With a multitude of sensors and a vendor-agnostic platform, PRTG Network Monitor enables you to use ONE solution to monitor your entire infrastructure including applications, software, hardware, cloud & virtual environments.
New Veeam Backup Free Edition v9 is a must-have free tool for ad-hoc virtual machine backup, restore and management in VMware vSphere and Microsoft Hyper-V virtual environments.
File System Security PowerShell Module makes managing permissions with PowerShell easier:
Dnsclush lets you analyze DNS logs for visits to malware sites:
YALV! is a log viewer for Log4Net that allows you to compare, merge and filter multiple log files simultaneously:
GOT TIPS you'd like to share with other readers? Email us at firstname.lastname@example.org
I personally don't like Cortana and don't want it on my Windows 10 machines. Unfortunately upgrading to the Windows 10 Anniversary Update means you can no longer turn Cortana off from Settings. But there's still a way you can disable Cortana as the How-To Geek explains in this helpful article:
WindowsManagementExperts (WME) has a tip on how to use a PowerShell script to clear the ConfigMgr cache during your task sequence: http://www.wservernews.com/go/ee9npqzy/
From Rod Trent's myITForum comes this article by Joseph Yedid about how to create a DCM item in Configuration Manager to detect if the firewall is off:
Microsoft Ignite on September 26-30, 2016 in Atlanta USA
Microsoft Ignite Australia on February 14-17, 2017 at the Gold Coast Convention & Exhibition Centre, Broadbeach, QLD
Microsoft Worldwide Partner Conference (WPC) on July 9-13, 2017 in Washington, D.C.
PLANNING A CONFERENCE OR OTHER EVENT you'd like to tell our 100,000 subscribers about? Contact email@example.com
Understanding ADFS claim rules in combination with Azure AD: http://www.wservernews.com/go/t8r5nnh9/
VMware Fusion: Top 3 mistakes your users are making
AWS OpsWorks demystified
Microsoft’s new Windows 10 readiness tool
Citrix Synergy 2016: Security Threats Are Evolving into Combination Attacks (BizTech)
Security visibility in the cloud (WindowsSecurity.com): http://www.wservernews.com/go/2njm2h8l/
Using the public cloud doesn't mean you have to sacrifice visibility into app and workload performance. The right set of tools can give IT a fuller picture. Check out this tip from our editors highlighting some of those tools.
The most recent of several Slack outages over the last year has sparked a discussion about ways to guard against and mitigate SaaS risks. Find out more about the debate and how you can prepare for SaaS failure within your environment.
Despite the advantages of cloud computing technology, concerns over security, network latency and job security have hindered public cloud adoption rates. Luckily, TechTarget’s new PI/Cloud Infrastructure Survey sheds light on common public cloud concerns that have held businesses back from widespread adoption.
Skillful VDI image management is crucial to a successful virtual desktop deployment. Find out the best ways to update and manage VDI base images in this complimentary tip from our editors.
GOT FUN VIDEOS or other fun links you'd like to recommend? Email us at firstname.lastname@example.org
Professional backyard engineer Colin Furze shows off his latest project - a 31 foot (9.5 meter) spinning 360 swing:
A compilation of amazing uneven bars gymnastics moves from previous Olympic Games, which are now banned for being too dangerous: http://www.wservernews.com/go/jhkw05in/
A brilliant street performer entertains tourists and locals just outside the Duomo Cathedral in Milan, Italy:
Mitch Tulloch is Senior Editor of WServerNews and is a widely recognized expert on Windows administration, deployment and virtualization. Mitch was lead author of the bestselling Windows 7 Resource Kit and has been author or series editor for almost fifty books mostly published by Microsoft Press. Mitch is also a ten-time recipient of Microsoft's Most Valuable Professional (MVP) award for his outstanding contributions in support of the global IT pro community. Mitch owns and runs an information technology content development business based in Winnipeg, Canada. For more information see www.mtit.com.
Ingrid Tulloch is Associate Editor of WServerNews and was co-author of the Microsoft Encyclopedia of Networking from Microsoft Press. Ingrid also manages research and marketing for our content development business and has co-developed university-level courses in Information Security Management for a Masters of Business Administration program.