Daniel Root: 2012

Tuesday, November 27, 2012

A Time for Everything from Railways to Databases

I read with interest a recent Wired post about Google Spanner, possibly one of the largest databases in existence. What piqued my interest was a historical parallel I’d picked up watching an episode of ‘History Detectives’ a while back. In the episode, they researched a strange clock that, it turned out, was used to synchronize time across railway stations in the Midwest.

A little history of time (no, not _that one_). Prior to the 18th century, towns kept accurate time by using sundials. Communication and travel were slow, and it generally didn’t matter if Ye Olde Pocket Watch was a few minutes off. When it did matter, travelers could set their clocks by the town sundial and generally be within a few minutes of accuracy. The late 1700s and early 1800s saw rail travel invented and (pun intended) pick up steam. It doesn’t take a math whiz to know that if a train leaves Oxford Station for London at 11am, and one leaves London Station for Oxford at 11am on the same track, both going 50 MPH, it ain’t good. The need to coordinate times between train stations to avoid wrecks and unhappy customers, among other things, led to time standardization and the telegraph. Greenwich would send out a message right on the dot, and the local operators would set their clocks, ensuring that Oxford sent their train out right at 11:05 so that it could just miss the one coming from London at 11:00. The History Detectives ‘Chicago Clock’ would take that time and send it down the track as far as New Orleans at high noon, so that all the stations on the line had accurate times.

The two hundred years since have seen travel and communication speeds increase. Trains got faster and started flying. Cars and highways came, and telegraph systems evolved in large part to help schedule them, build them, or to avoid traveling at all. The principles are largely the same as the railroad networks: time is key in keeping systems in sync, be it landing your flight to Hong Kong, or making sure your most recent Facebook post shows up in the right order. (Actually, the latter can afford to be off by a bit more than the former, a fact many large systems exploit.)

In Google’s case, it was important to sync times between data centers across the globe that host their cash cow ad network. Without accurate time across their datacenters, transactions could potentially be stored out of order. Just like the Oxford-bound train, this is bad news for a database, where to have any hope of accurately replicating geographically, you need to know the order in which transactions are created. When billions of transactions occur per second, and come from all over the planet, you need a really accurate clock in every building to precisely tag each transaction with a time that every other building will know is accurate when the transaction arrives. Even a J.B. Mayo clock won’t cut it. So, Google did the only reasonable thing and built a redundant clock based on GPS and atomic clocks, to be installed in each data center. This way, when an ad purchase in Mumbai is copied to the Houston data center, they know it was purchased a millisecond before one in Topeka.
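To make the ordering problem concrete, here is a toy sketch in Python. It is not how Spanner works internally; the `Stamped` class, the skew values, and the `replay_order` helper are all made up for illustration. It only shows that when two data centers' clocks disagree by more than the gap between two transactions, sorting by recorded time replays them in the wrong order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamped:
    """A transaction tagged by a local clock (hypothetical model)."""
    name: str
    true_time_ms: float   # when it really happened (unknowable in practice)
    clock_skew_ms: float  # how far off the local data center clock is

    @property
    def recorded_ms(self) -> float:
        # The timestamp the other data centers actually see.
        return self.true_time_ms + self.clock_skew_ms


def replay_order(events):
    """Order transactions the way a replica would: by recorded timestamp."""
    return [e.name for e in sorted(events, key=lambda e: e.recorded_ms)]


# Mumbai's purchase really happens 1 ms before Topeka's.
# With clocks off by a few milliseconds, the replica gets it backwards.
sloppy = replay_order([
    Stamped("mumbai", 1000.0, clock_skew_ms=+5.0),
    Stamped("topeka", 1001.0, clock_skew_ms=-5.0),
])

# With clocks accurate to well under the 1 ms gap, the order is preserved.
tight = replay_order([
    Stamped("mumbai", 1000.0, clock_skew_ms=+0.1),
    Stamped("topeka", 1001.0, clock_skew_ms=-0.1),
])

print(sloppy)  # ['topeka', 'mumbai']
print(tight)   # ['mumbai', 'topeka']
```

The point of the sketch is simply that the clock error has to be smaller than the smallest gap you care about ordering, which is why a very accurate clock in every building matters.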

And there you have it: the most massive database on the planet owes a nod to the steam engine and a 19th century clock.

Wednesday, August 22, 2012

The Woz is Wrong: Content, Cocaine, and Clouds

Recently, Apple co-founder Steve Wozniak was quoted making dire predictions about the cloud: “Horrible things” will happen in 5 years. He mainly referred to the loss of ownership rights. But I have a different take: You never really owned that 8 track anyway.

Stop a minute to think about this. I was never privy to actual 8 tracks, but in 30-some years have lived through 3 major music formats, and a few minor ones: tape, CD, mini-disc, and MP3. I was never what you would call an aficionado, but most people I know re-purchased music during each of those formats. Almost nobody I know listens to tapes or CDs any more, and almost nobody I know transferred from record to 8 track, or record to tape, or tape to CD. (Plenty did transfer from CD to MP3.) Same for movies: I remember standing in a ~~record~~ ~~tape~~ CD store in the mall thinking LaserDisc was going to be the format of the future. But those LaserDisc and Betamax owners never transferred to DVD, and I have a box of VHS now that will probably never be used again. I know people re-purchasing their DVD collection as Blu-ray now. I am trying to transfer DVDs to disk, but even that is slow going, and quite possibly not worth the effort. All that to say: the reality for most people is that this monumental media library that you own and pass on to posterity is a myth. Since recorded media began, you've never really owned content indefinitely, thanks to the progress of recording technology.

Now, it’s easy to think that humanity has finally arrived. That the ~~MP3~~ OGG and ~~M4V~~ ~~AVI~~ ISO stored on a CD ~~DVD~~ ~~Hard Disk~~ SSD represents the final epitome of recording formats. From here on out, we own our music and our grandkids will thank us for preserving the 128 kbps copy of Beastie Boys’ Root Down. Maybe so. Or, maybe, we’re just like those star-struck hippie kids grooving to the 45RPM Beatles they just saved all week to buy. Honestly, I really am torn: I do like the idea of “owning” content, but when it takes hours to rip a DVD, and then terabytes of disk to keep it, I have to wonder if it’s really worth it. Why _not_ just rent rights to content for $10-$20/month?

Add to that my growing curmudgeonry toward all things celebrity, and there’s a danger I go from non-aficionado to all-out media non-consumer. Why again would I pay 20 hard-earned bucks to own the latest Hollywood epic or must-have single, and encourage some actor or musician telling me how to vote while I pay for his cocaine binge parties? No thanks. What we need is less celebrity and more quality. Start-up bands, open air concerts, and indie flicks. That is worth paying for.

But I digress. No doubt bad things will happen in the Cloud. Amazon has gone down at least once. Azure and iCloud probably will also at some point. But, then again, I know plenty of people that lost all of their photos and music thanks to “owning” it on their single hard disk with no backup. I’d venture to say most data centers are more reliable than your home PC. But for whatever bad happens in the cloud, Woz is still wrong: “horrible” things have already happened. My Hendrix CD sat on the car dash and got ruined in the heat. My wife’s VHS collection of Little House on the Prairie will likely never be enjoyed by my daughter. My Ace of Base tape cracked in my backpack. It turned out, these things weren’t that horrible after all. Somehow, I managed to live through it, and to see a day where we feed our desire for music and video with a little Amazon, Netflix and Spotify.

Monday, August 6, 2012

Information Management Policy Recurrence (SharePoint Devilish Details)

Information Management Policy is one of those oft-overlooked features in SharePoint. It can really help organizations keep content fresh, but like so much else in SharePoint, the devil is in the details. Depending on the scenario you’re trying to implement, you may find some details causing confusion.

In one recent case, a customer asked for a solution that prompted users to review content that hadn’t been modified in 365 days. Initially, this seemed straightforward. Set up policy to review every 365 days, like this:

But, reading the recurrence description made me think there may just be a devil lurking in this dialog. So I wouldn’t have to wait a year, I set up a series of tests that expire in 1 day, with varying settings. Based on those tests, here are some gotchas I learned.

The retention policy will _not_ recur just based on Modified date. In the example above, the initial workflow will only occur once, as noted at the bottom of the dialog. So if it kicks off after 365 days, and the user changes the modified date, it will not kick off again in another 365 days unless you specify a ‘recurrence’. To me, this does seem counter-intuitive. I expected changing the modified date to reset the clock for the recurrence policy. But, I think the concept is that the item is flagged as having “expired”, and will never “expire” again, regardless of what actions the user takes. The workflow option is more intended to move the document elsewhere, archive it, etc. Still, by specifying recurrence of 365 days, you get the desired behavior: the workflow runs every 365 days.

The recurrence happens based on the _initial_ occurrence. So, after the initial review kicks off, if you specify a recurrence period of 365 days, the workflow will run every 365 days from the time the document first expired, regardless of modified date. Again, this is counter-intuitive to me. I expected, and the customer wanted, the review to occur every time the document had not been modified in 365 days. This is, in fact, the initial behavior: editing the document bumps up the initial occurrence. But, once that first “expiration” happens, the recurrence will be every N days, regardless of changes to the document.

So, how do we handle this scenario? It takes a little more logic in the workflow, but is not too difficult. Simply set the initial occurrence to 365 days after modified date, but the recurrence to a more frequent period – say 2-4 weeks. Then, add logic in the workflow to check the modified date:

This way, the workflow kicks off initially at 365 days and every 2-4 weeks after that, but an actual review only takes place when the document really has gone 365 days without modification.
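The real check lives in the workflow itself, but the decision it makes is small enough to sketch in a few lines of Python. This is only an illustration of the logic described above; the `needs_review` name and the `REVIEW_AFTER` constant are hypothetical and not part of SharePoint.

```python
from datetime import datetime, timedelta

# Assumption: the customer's rule is "not modified in 365 days".
REVIEW_AFTER = timedelta(days=365)


def needs_review(last_modified: datetime, now: datetime) -> bool:
    """Decide, on each recurrence, whether to actually start a review.

    The policy fires the workflow every few weeks; the workflow then
    exits quietly unless the document has gone stale.
    """
    return now - last_modified >= REVIEW_AFTER


today = datetime(2012, 8, 6)

# Edited two months ago: the recurrence fires, but no review is needed.
print(needs_review(datetime(2012, 6, 1), today))   # False

# Untouched for over a year: start the review.
print(needs_review(datetime(2011, 7, 1), today))   # True
```

With a 2-4 week recurrence, a stale document is caught at most 2-4 weeks after it crosses the 365-day mark, which is the trade-off for not having the policy itself track the modified date.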

Bonus PowerShell Timer Job Kick

Expiration Policy Processing occurs weekly by default. If you ever need to test, or want to trigger processing at once, you can run the timer jobs in Central Administration, or from PowerShell:

Get-SPTimerJob | ?{$_.Name.Contains("ExpirationProcessing")} | %{$_.RunNow()}

I didn’t find many specifics on how those settings played out, so I hope this is helpful for somebody (maybe me?) down the road.

Wednesday, July 18, 2012

Nifty Code: Output Azure Blob Directory As Zip File

Following the release of several new Azure features, I’ve been tinkering again with the various cloud offerings. I’m very impressed with this release, which brings ‘Virtual Machines’ for super-cheap, super-easy Virtual Private Server capability. In addition, the new ‘Websites’ model makes developing for Azure as easy as any other hosting provider – Git, FTP, TFS, and Web Publishing support make it super simple to get code up into the cloud, where you have buttery-smooth configuration and monitoring capabilities.

In the course of my tinkering, though, I whipped up a nifty little code snippet I thought I’d share. Say you’d like to let a user download a zip archive of files and folders stored in Azure Storage from an ASP.NET MVC or Web Forms app. Thanks to the de-facto standard .NET Zip library, SharpZipLib, this is relatively easy:

       private void OutputBlobDirectoryAsZipDownload(CloudBlobDirectory directory)
        {
          
            byte[] buffer = new byte[4096];
            Response.ContentType = "application/zip";
            var fileName = directory.Uri.Segments.Last().Replace("/", "");

            Response.AppendHeader("content-disposition", "attachment; filename=\"" + fileName + ".zip\"");
            Response.CacheControl = "Private";
            Response.Cache.SetExpires(DateTime.Now.AddMinutes(3)); 
            ZipOutputStream zipOutputStream = new ZipOutputStream(Response.OutputStream);
            zipOutputStream.SetLevel(3); //0-9, 9 being the highest level of compression
            var allFiles = directory.ListBlobs(new BlobRequestOptions { UseFlatBlobListing = true }).Where(x => x.GetType() == typeof(CloudBlockBlob)).Cast<CloudBlob>();

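            foreach (var file in allFiles)
            {
                // Stop early if the client has gone away; no point zipping the rest.
                if (!Response.IsClientConnected)
                {
                    break;
                }

                // Dispose the blob stream even if a read or write throws.
                using (Stream fs = file.OpenRead())
                {
                    // Entry name is the blob's path relative to the directory being zipped.
                    var entryName = file.Uri.ToString().Replace(directory.Uri.ToString(), "");
                    ZipEntry entry = new ZipEntry(ZipEntry.CleanName(entryName));
                    entry.Size = fs.Length;
                    zipOutputStream.PutNextEntry(entry);

                    // Copy the blob into the zip in 4 KB chunks, flushing as we go.
                    int count = fs.Read(buffer, 0, buffer.Length);
                    while (count > 0 && Response.IsClientConnected)
                    {
                        zipOutputStream.Write(buffer, 0, count);
                        Response.Flush();
                        count = fs.Read(buffer, 0, buffer.Length);
                    }
                }
            }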
            zipOutputStream.Close();
            Response.Flush();
            Response.End();
        }

This is essentially just SharpZipLib’s sample code adapted to work with Microsoft’s Azure StorageClient code. To use, simply use NuGet to add SharpZipLib and StorageClient to your app – it need not be an Azure application. Then call this code, passing in a CloudBlobDirectory instance, and voilà! Cloud files packaged in a single zip file for your users to download.

There are some caveats: there is an upper limit to the number of files the zip can contain, and likely in the size of the zip. I’m not sure this would work for downloading gigs of data. But, I can think of a few scenarios where it might be interesting.
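For readers who don't live in .NET, the same idea can be sketched with nothing but Python's standard `zipfile` module. This is a minimal sketch, not a port of the code above: the `zip_directory` function and the dictionary of blob URIs standing in for Azure Storage are assumptions made for illustration, and the archive is built in memory rather than streamed to a response.

```python
import io
import zipfile


def zip_directory(directory_uri: str, blobs: dict) -> bytes:
    """Package every blob under directory_uri into one zip archive.

    blobs maps a full blob URI to its content (a stand-in for storage).
    Entry names are the blob paths relative to the directory, so the
    folder structure survives inside the zip.
    """
    prefix = directory_uri.rstrip("/") + "/"
    out = io.BytesIO()
    # Level 3 mirrors the light compression used in the C# version.
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED, compresslevel=3) as archive:
        for uri, data in blobs.items():
            if not uri.startswith(prefix):
                continue  # not part of this directory
            archive.writestr(uri[len(prefix):], data)
    return out.getvalue()


# Hypothetical storage contents.
storage = {
    "https://example.invalid/files/reports/a.txt": b"alpha",
    "https://example.invalid/files/reports/2012/b.txt": b"beta",
    "https://example.invalid/files/other/c.txt": b"gamma",
}

payload = zip_directory("https://example.invalid/files/reports", storage)
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
print(names)  # ['a.txt', '2012/b.txt']
```

Building the archive in memory sidesteps the streaming concerns entirely, which is fine for a sketch but is exactly the part that would not scale to gigs of data.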

Tuesday, July 10, 2012

Blackbelt SharePoint Debugging

You know the 14 hive like the back of your hand, and can look up a correlation id in 10 seconds flat. You know how to handle everything Health Analyzer can throw at you. Central Admin bores you – you like the cozy black and white of the PowerShell console. And now, some third party thing is broken, not logging, and all tech support can do is ask you to reboot the servers (which you did 3 days before you called support - FizzBin).

You have come a long way – it’s time to complete your training and become a SharePoint Debugging Blackbelt. You will be armed with three tools:

dotPeek – a free .NET decompiler from JetBrains. Yes, we’re going there.
PowerShell – this is just .NET in the command line. You’re about to kick some SharePoint butt.
A working knowledge of SharePoint and .NET.

Rule #1: Know your Enemy

dotPeek allows you to decompile and look at the code of any managed .NET assembly. Including those in the GAC, and those written by Microsoft. SharePoint is, in the end, really just a huge ASP.NET app. Almost all SharePoint 3rd party add-ins are written in .NET. You see where I’m going here.

Isolate the problem. Do your best to get repeatable steps and know more or less what part of SharePoint or 3rd party is not working as expected. If the problem is on a particular page under _layouts, note the page, and then open it from the corresponding folder in the 14 hive. At the top, especially if it is a 3rd party, note the namespaces and assemblies involved. For example, the Page directive may have an “Inherits” attribute. Register directives may have a “Namespace” or “Assembly” attribute. Examine the page and see if you can identify a problem control.

For example, the case that led me to post this had a very specific page that was not behaving as expected. I looked at the page in the hive, and found a suspect control on the page, named SLActivityToolbox and a Nintex.Workflows.ServerControls assembly.

Install dotPeek on the server. It should be safe for production boxes, but if you can, do this in a test environment. You don’t have a test environment? You are not ready for this post then!

Examine the assemblies in dotPeek. Generally, they will be in the GAC. However, some 3rd party tools may be in the \bin folder of a web app, or even in the 14 hive. In my case, I opened Nintex.Workflows.ServerControls and started reading code. There, nestled in the ‘Render’ method, I found a suspect. It’s a given you’ll need to know some .NET code to do this, but the times I’ve done this, it’s usually fairly easy to read and figure out at least some problem areas.

Rule #2: Use your Enemy’s Weapons Against Them

In some cases, just perusing the code in dotPeek is enough to get to a solution. In other cases, it takes fighting back. Enter PowerShell, which as I’ve said is just command-line .NET. This means that you can load .NET assemblies and tinker. You see where I’m going.

I imported the Nintex server controls assembly with code like this:

$x = [System.Reflection.Assembly]::LoadWithPartialName("Nintex.Workflow.ServerControls")

This post has a good overview of loading assemblies from various sources. Loading an assembly in this way is similar to adding a reference in a .NET project. All of the code within the assembly is now available in the command line. Use it wisely, grasshopper.

From reading code, I suspected a call to SnippetRepository.GetAll() was the culprit. However, it needed an SPSite instance for me to call it. And here is where it really clicks that PowerShell is really just .NET. That SPSite instance can come from the built in SharePoint PowerShell commands:

$s = get-spsite http://somesite
$r = new-object Nintex.Workflow.ServerControls.SnippetRepository $s
$r.GetAll()

We get an SPSite instance and pass it to the constructor, then call GetAll. This call threw an exception, but confirmed my suspicions. Armed with this, I was able to chase down a modified library on the site and fix it, correcting the problem.

Rule #3: Use for Good, Not Evil

Another appropriate title for this post could be “How to debug SharePoint like a crazy man.” Use these techniques at your own risk. You can seriously hose a farm doing this, so only attempt in situations where you have nothing to lose or really know what you’re doing. In my case, unfortunately, it was the former.

Since the code I ran only queried and did not update, I felt somewhat safe. Plus, this was in a test environment, but even so, this probably wasn’t the safest debugging session I’ve ever done. That said, you do what you have to, and in extreme cases, it’s good to know you have some heavy weapons like dotPeek and PowerShell to bring in on particularly hairy situations.

Tuesday, July 3, 2012

Apps You Should Have (Not Just For Nerds)

I recently posted about Trello as a service that I think even non-IT-professionals may find useful. Continuing that vein, here are four other services that I think anybody with a computer should have.

Evernote

Evernote is a “second brain” (no comment on the first one). It runs on just about any computer and phone, and lets you very easily capture text, photos, files, and more. Every note gets synced to every device, and there is rich support for tagging and search. You can quickly jot notes, send emails into it, and even snap pictures of business cards and labels you want to remember, then pull them back on any device. Here are some ways I use it:

Notes for work. I constantly have to remember “what did I do to fix XYZ”. Now, if I fix something, I’ll record it in Evernote. I have literally hundreds of solutions in my own personal “Knowledge Base”.
Shopping lists. For trips to “the Man Store” (i.e., Lowe’s), or the grocery store, I’ll jot down what I need and refer to it at the store.
Paint numbers. Whenever I’ve painted or had things painted, I snap a picture of the label they put on the can that has the color numbers. They can then use this to match a color later.
Favorite wines/beers. If I find something I like, I snap a picture of the label or write it in a note so I can remember later.
Gate codes. I put in gate codes for storage and the pool. Since notes are marked with a location, I can easily see the notes near me and pull up, say, the storage gate code when I’m at the storage location.

So, go get Evernote. If you want to read more about how to use it, Evernote Essentials is a great book about exactly how it works and things you can do with it.

Carbonite

If you don’t have _some_ backup for your photos and other files, then you are doing it wrong. Your computer will fail, and when it does, all those pictures that only live on your computer are going away. There are lots of ways to do backups, but Carbonite is dead simple and cheap. For $5 a month you get unlimited backup of all of the important files on your machine.

Google Reader

News readers let you get content from just about any website or blog in an easy-to-read and relatively ad-free user interface. Think of it like an electronic newspaper where you control what makes the front page. In fact, when Simeon asks “what you doin on your iPad?” I reply “reading the newspaper”. Of all of the news readers I’ve tried, Google Reader is the best. I can check dozens of sites in a matter of minutes, as opposed to wasting time browsing. As a result, I’m always up to speed on news in my field, hobbies, as well as general national and local news. If you have an iPad or iPhone, Reeder is a great client for Google Reader.

IFTTT

If This Then That automates tons of online services in a very user friendly way. I know that sounds like something only nerds would care about, but here are some ways I use it, only some of which are uber-nerdy:

When it’s going to rain tomorrow, it sends me an email. This alone is worth it. I don’t care about the weather unless it’s going to rain tomorrow! Unlike those annoying weather alert sites, these are short one-line notices.
When there is a new movie on Netflix, it emails me.
When I post to my blog, it posts automatically to Facebook, Twitter, and LinkedIn.
When I upload a painting to Flickr, it posts automatically to Twitter and Facebook, and uploads to Dropbox.

If you have a blog, those automatic posts to Facebook and Twitter are gold!

Friday, June 8, 2012

My Wife, Developer

My wife has been working on the launch of a new system for the last 9 months, and we are both proud to announce she is officially out of beta! This amazing little package is actually comprised of 13 subsystems, with an internal architecture consisting of 78 individual modules. Detailed code analysis reveals it contains 3 billion lines of code, 20,000 classes in 23 libraries. All this in a hardware package just 8 lbs, 10 oz. We both contributed to the final release, but we both feel it really is a miracle from God that this project came together the way it did. Despite this complexity, the resulting solution is extremely intelligent and has a really beautiful user interface. For the most part, all systems are running well. The only part that really stinks is in the backend, but we think in the future this will eventually take care of itself. For now, we just patch up the backend and it does fine. While this initial rollout is pretty amazing as is, we have big plans for future development, so stay tuned!

Michel Ellen, we are so proud of you and look forward to sharing life with you!

Sources
http://www.ornl.gov/sci/techresources/Human_Genome/project/info.shtml
http://howmanyarethere.net/how-many-organs-are-in-the-human-body/
http://www.biblegateway.com/passage/?search=Psalm+139%3A14&version=NKJV

Tuesday, May 29, 2012

Say Hello to Trello (Not Just For Nerds)

Occasionally in my line of work, I'll run into a website, utility, or service that has potential to really help the day-to-day lives of people not only in the IT field, but in just about any line of work. Or, perhaps even no line of work. Trello is just that sort of service. Their website describes it: "In one glance, Trello tells you what's being worked on, who's working on what, and where something is in a process." If you get how a free tool that lets you do that could be really useful, stop reading and go sign up. Otherwise, read on for a little background and ideas about how you can put this to good use.

There's Nothing New Under the Sun

Trello has its roots in a system of project management called Kanban, first developed by Toyota in the 1940s. Kanban literally means 'signboard', and originally was implemented with cards placed in slots on a board hung on a Toyota factory wall. Each card represented a particular task to be done and went in rows representing the overall process. Employees would move cards to their row, and as they did the status of the assembly line would be reflected in the board. An empty slot meant somebody else had work to do to fill the slot with a new card. With the simple action of moving paper on a board, a whole car factory was run and management could see the current status of the line at any given moment. Since then, the system has developed into a number of similar and related techniques, especially used by software and hardware development teams. At its simplest, Kanban is implemented today using a whiteboard with Post-it notes. Rows are drawn on the board, and a Post-it scribbled with some task moves from row to row as work gets done. Then there are a variety of sites and systems for implementing Kanban and its variants electronically. If you look at the monitors hanging in your favorite fast-food establishment, you may even recognize your gut bomb order moving through rows as it progresses toward your hands.

Believe it or not, there are people who geek out on this stuff way more than me, but here's the takeaway: Kanban is relatively old, Japanese, and has run everything from car factories to fast food joints. But there's more it can do. Trello takes this concept and makes it something anybody can use for anything.

Almost Anything Could Use a Good List or Three

It turns out, there are tons of processes that could benefit from a good Kanban-like board. If a process has regular defined stages, it's a candidate. When you first create a board in Trello, it has three lists: ToDo, Doing, Done. To each list, you add cards that represent tasks. As the task progresses, you move it to a different row. So, a default board is great for honey-do-lists, planning parties, or just keeping up with your work day or some project. But Trello also lets you rename lists and add new ones. For example, a Sales Pipeline board may have lists like this: Lead, Introduction, Proposal, Negotiation, Deal. You would add new prospects to the Lead list, then as they progress, move them through the rows. This is handy enough for one person, but when you involve more than one, it's almost a necessity. Trello also lets you join people to cards, so that people can see "what's being worked on, and who's working on what".

There's one more aspect of Trello, and Kanban in general, that's worth discussing. Cards may also be prioritized by simply moving them up and down in the rows. Things on top are more important than things on the bottom. You may also color-code cards to communicate some status: green cards are good, red cards are bad. In Trello, you may also add comments, links, checklists, and pictures to cards to communicate even more.

If you're still not sure how this could be used, here are some ideas, drawn from real-world projects that some friends and family are working on that could be done in Trello:

Sell your house. Prioritize tasks and keep track of their status.
Run a product development team or assembly process.
Stay organized while writing a novel. Keep a todo list and a 'waiting on' list for tasks waiting on somebody else.
Track tasks in a medical study.
Plan a family vacation.
Organize your next album release.
New employee onboarding. Have cards for each step, linking to documents the new employee needs.
Plan a church workday.
Plan and deploy a large software platform.
Run a school group project.

A Lot of Value for Not a Lot of Effort

In the end, what makes this appealing is that you get a lot, for not much work. It's dead simple to create and move cards around. And just like those Toyota employees in the 40s, that simple action provides tons of benefit: You can communicate, prioritize, and stay focused as a free side-effect of just moving some cards. But there are also some less-obvious side effects where Trello can really begin to help improve a process. Let's revisit that Sales Pipeline. Say there are only a few cards left in the 'Lead' list. That tells you or management something very important: you need more leads. Watching lists over time, you most likely will notice that not every lead goes all the way through to the 'Deal' list. By finding where cards stop, you see where there is room for improvement. If they stop in Negotiation, maybe pricing needs to be re-evaluated, but if they stick in Introduction, it may be time to polish the sales pitch. Again, just by moving cards in a board, you get valuable insight into your sales process that, if handled well, can result in more closed deals.

If you want to go there, it can get even deeper. Imagine now that you are tracking not just the current state of your board, but the history. You could then record the number of cards in any particular list on any given day. Put in a graph, that could show, say, the number of leads over time. You can then begin to see historical trends: leads spike in the summer, and are low in the winter. In the development world, this is known as a 'burndown chart' because in a healthy one, you see the number of cards decrease as feature cards get completed and/or bug reports go down. You might also imagine a total count of how many cards moved in a given day. This tells you the overall velocity of a given project. Tracked over time, this gives you a good idea of how active a given project is. Trello doesn't expose this sort of analytics directly (yet), but it does expose an API that 3rd parties are beginning to use to do this sort of thing. Hopefully, the concept is settling in: just by moving a few cards around, you can build pretty deep insight into a process.
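For the curious, here is roughly what that bookkeeping could look like in Python. It is a sketch under assumptions: the move history is a made-up list of (day, card, destination list) tuples rather than anything pulled from Trello's actual API, and the function names are invented for illustration.

```python
from collections import Counter


def cards_in_list_by_day(moves, list_name, days):
    """Count how many cards sit in list_name at the end of each day.

    moves: iterable of (day, card, destination_list) tuples.
    days:  the days to report on, in chronological order.
    """
    by_day = {}
    for day, card, dest in moves:
        by_day.setdefault(day, []).append((card, dest))

    location = {}  # card -> list it currently sits in
    counts = {}
    for day in days:
        for card, dest in by_day.get(day, []):
            location[card] = dest
        counts[day] = sum(1 for where in location.values() if where == list_name)
    return counts


def velocity_by_day(moves):
    """Total number of card moves per day: a rough activity measure."""
    return Counter(day for day, _card, _dest in moves)


# Hypothetical history for a small sales pipeline board.
history = [
    (1, "Acme", "Lead"),
    (1, "Globex", "Lead"),
    (2, "Acme", "Proposal"),
    (3, "Initech", "Lead"),
    (3, "Globex", "Deal"),
]

print(cards_in_list_by_day(history, "Lead", days=[1, 2, 3]))  # {1: 2, 2: 1, 3: 1}
print(dict(velocity_by_day(history)))                         # {1: 2, 2: 1, 3: 2}
```

Plot the first result over a few months and you have the leads-over-time graph; plot the second and you have a velocity chart.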

TL;DR (Too Long; Didn’t Read)

You don't need to be interested in all the project management geekery to make use of Trello. This tool really has potential to be the next "Excel"*: a good general purpose tool useful for all sorts of things from the mundane todo-list up to running a car factory. I plan to post some more on how I'm personally using this tool, so stay tuned! For now, give it a spin: https://trello.com/

*and with good reason. The head honcho Joel Spolsky worked on Excel in 1991.

Tuesday, May 15, 2012

Obscure Networking Error: Fiddler Ate My Internet

This one gets me every single time. I use the awesome network traffic analyzer Fiddler to debug particularly thorny web service or similar issues. Occasionally, though, it won’t close gracefully – for example, my PC hangs and I have to force-restart it, or something crash-exits it. In these cases, when the machine comes back, Internet Explorer is unable to connect to anything. Ping and other networking tools look fine. The reason is that Fiddler works as a proxy, tunneling all IE traffic through itself for inspection. If it crashes, it can’t remove the proxy, and IE is stuck tunneling to a non-existent proxy.

Fortunately, once I remember what’s going on, the fix is easy: Just open Fiddler, and close it gracefully. So, moral of the story is: If your IE won’t connect to anything, and you’ve been fiddling with Fiddler, open Fiddler and see if it fixes the problem!
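If you'd rather confirm the diagnosis before reopening Fiddler, a quick check is to see whether anything is actually listening where the proxy setting points. The Python sketch below does only that; the `proxy_is_listening` helper is hypothetical, and the address is an assumption based on Fiddler's default of 127.0.0.1:8888, so check your own proxy settings for the real value.

```python
import socket


def proxy_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: nothing is home at that address.
        return False


if __name__ == "__main__":
    # Assumption: the stale proxy setting points at Fiddler's default port.
    if proxy_is_listening("127.0.0.1", 8888):
        print("A proxy is listening; the problem is probably elsewhere.")
    else:
        print("Nothing is listening; a stale proxy setting is a likely culprit.")
```

If nothing is listening and the browser is still configured to use that address, you are looking at exactly the situation described above.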

Monday, May 7, 2012

Quick Tip: jQuery and Knockout Intellisense in Standalone JS

So simple, yet one of those things that takes stopping and looking up. If you want intellisense for jQuery and KnockoutJS, in a standalone JavaScript file, just add these lines to the top of your file:

/// <reference path="jquery-1.6.2-vsdoc.js" />
/// <reference path="knockout-2.0.0.debug.js" />

Obviously, adjust the versions and paths as needed.

Tuesday, March 27, 2012

How To: Bulk Delete Files in a Large SharePoint Document Library using PowerShell

A client recently had a case where a migration of tens of thousands of documents into a document library failed, and they wanted to delete everything and start over. It turns out there is no easy ‘Delete Everything’ button or API call in SharePoint. Instead, I came up with this gem to fairly quickly check in any ‘orphan’ files, delete a few thousand items in batches of 1000, and then delete all folders:

param($url,$libraryName)

$w = get-spweb $url
$l = $w.Lists[$libraryName]
$format = "<Method><SetList Scope=`"Request`">$($l.ID)</SetList><SetVar Name=`"ID`">{0}</SetVar><SetVar Name=`"Cmd`">Delete</SetVar><SetVar Name=`"owsfileref`">{1}</SetVar></Method>"

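# Builds the batch XML that deletes every item in $items.
function BuildBatchDeleteCommand($items)
{
    $sb = new-object System.Text.StringBuilder
    # Cast to [void]: Append returns the StringBuilder, which would otherwise leak into the function's output.
    [void]$sb.Append("<?xml version=`"1.0`" encoding=`"UTF-8`"?><Batch>")
    $items | %{
        $item = $_
        [void]$sb.AppendFormat($format,$item.ID.ToString(),$item.File.ServerRelativeUrl.ToString())
    }
    [void]$sb.Append("</Batch>")
    return $sb.ToString()
}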

# $count is the last index in CheckedOutFiles, so the number of files is $count + 1.
$count = $l.CheckedOutFiles.Count -1;
Write-Host "Taking over $($count + 1) items that have never been checked in."
for($i = $count; $i -gt -1; $i--){
    $f = $l.CheckedOutFiles[$i];
    $f.TakeOverCheckOut()
    Write-Host $f.Url
}

Write-Host "Deleting $($l.Items.Count) items"
while($l.Items.Count -gt 0){
    $q = new-object "Microsoft.SharePoint.SPQuery"
    $q.ViewFields="<FieldRef Name=`"ID`" />"
    $q.ViewAttributes = "Scope=`"Recursive`""
    $q.RowLimit=1000

    $items = $l.GetItems($q)
    $cmd = BuildBatchDeleteCommand($items)
    Write-Host "Deleting $($items.Count) items..."
    $result = $w.ProcessBatchData($cmd)
    if ($result.Contains("ErrorText")){ Write-Host "Batch delete failed: $result"; break; }
    Write-Host "Deleted. $($l.Items.Count) items left..."
}

Write-Host "Deleting $($l.Folders.Count) folders..."
# Sort URLs descending so nested folders are deleted before their parents.
$l.Folders | %{$_.Url.ToString()} | sort -descending | %{
    $folder = $_
    $folder
    $f = $w.GetFolder($folder)
    $f.Files.Count
    $f.Delete()
}

Monday, March 5, 2012

10 SharePoint Devilish Details

I have a love-hate relationship with SharePoint. There is so much that it can do, and so many ways it can help an organization solve collaboration problems, and I have yet to see a better alternative. No, the company currently advertising as a SharePoint alternative is not really a replacement for everything SharePoint does. But on many occasions, I've found that with SharePoint "The devil is in the details". There are some things that SharePoint is so awfully stupid at that I want to throw it out the window some days. This list is presented for two reasons. First, I have a desperate, irrational hope that Microsoft will fix some of these issues. Second, to educate SharePoint users on where to expect some pain points. As with any system, sales is guilty of over-selling SharePoint's capabilities, and underplaying some of these 'devil in the details' problems. With that out of the way, here are 10 annoying problems with SharePoint:

One: InfoPath browser forms do not always open in the browser

You work hard and get a nice InfoPath browser form going. It looks great in IE and Chrome, and the customer is happy. Then they get a link in a task email. Or click a link in workflow history. Or link to the form from navigation. In all cases, despite your best efforts and various settings that sound like they might work, the form opens in InfoPath Filler for IE users that have InfoPath installed. Microsoft will tell you this is by design- even the wording hints at it: Browser Compatible Forms. As in: 'This form can run in the browser if it has to'. As in: 'This form won't break in the browser, but we still think people will LOVE having a fat client pop up to fill out or view a form.' As in: fix this junk. Users expect browser forms to always open in the browser, regardless of where the link is from.

Two: Workflow Status link in a list view on a page results in error

I blogged about this one already, but imagine you have a list with a workflow. You add a view of that list to a page. Clicking the workflow status column results in an exception because of a missing item in the link querystring. Dumber than a bag of hammers.

Three: Everything on the Software Limits and Boundaries Page

Kudos to Microsoft for letting it all hang out. At least the limits of SharePoint are all in one convenient place. I don't see other systems doing that. That said, when presented to end-users, these violate a core principle of UX design: Don't Make Me Think. Users create lists, get them working well with a little workflow, then WHAM get hit with errors because their library is over the huge limit of 5000 items (up from 2000!). Or their content database is over 200G, or their site collection over 100G. To be fair, until we get holographic storage arrays and quantum computers, any system has its limits. It's just that a lot of the problems here could be solved without worrying the user. The 5000 item list view limit especially needs a better "Don't Make Me Think" solution.
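
For what it's worth, code can at least sidestep the list view limit by reading a big list one page at a time. Here is a minimal sketch, assuming it runs somewhere the SharePoint 2010 server object model is loaded (the SharePoint Management Shell on a farm server, say). Get-ListItemsPaged and its parameters are names I made up for the example. The paging itself relies on SPQuery.RowLimit and ListItemCollectionPosition, and the query deliberately has no filter or sort on an unindexed column.

```powershell
# Sketch only: requires the SharePoint server object model to be loaded.
# Get-ListItemsPaged is a hypothetical name used for this example.
function Get-ListItemsPaged {
    param(
        [Parameter(Mandatory = $true)]$List,   # an SPList, e.g. (Get-SPWeb $url).Lists[$libraryName]
        [int]$PageSize = 1000                  # keep each page well under the 5000-item limit
    )

    $query = New-Object Microsoft.SharePoint.SPQuery
    $query.ViewAttributes = 'Scope="Recursive"'
    $query.RowLimit = $PageSize

    do {
        $page = $List.GetItems($query)
        foreach ($item in $page) { $item }     # emit each item to the pipeline
        # Carry the position forward; a null position means that was the last page.
        $query.ListItemCollectionPosition = $page.ListItemCollectionPosition
    } while ($null -ne $query.ListItemCollectionPosition)
}
```

None of which helps the end user staring at an error in the browser, which is rather the point.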

I'd go so far as to even question the boundaries placed by site collections vs sites and subsites. Every customer I've dealt with expects the ability to easily aggregate things like task lists and calendars across their entire farm, or manage permission groups that cross site collections. I understand some of the technical reasons these boundaries exist, but the end result is that they force novice users to make choices they are not prepared to make without significant upfront understanding of the boundaries and trade-offs, and some difficult-to-make predictions of future growth. "Read and understand the entire TechNet planning section and formulate a governance plan before you install the bits" is a hard sell.

Four: Common Form Scenarios Require Custom Workflow and Item-level permissions

Imagine this scenario: You have a paper form. When you put it on a person's desk, you are not allowed to see any other forms but yours. That person may reject it or get you to fill out a missing section, but otherwise you cannot sneak in and thumb through other users' forms, nor can you change your form after you have submitted it. I just described almost every paper form process ever. Now, try doing that in SharePoint. I'll spare you: you need to write a custom workflow or event receiver to make that happen by setting item level permissions, or moving items to secured folders. But don't run into the software limits and boundaries on item level permissions. And be sure to think through the security context of the event receiver or workflow. By default, SharePoint Form Libraries allow anybody who can submit an item to also see all other items that have been submitted. Add and View permissions are tied together at a fairly low level. This topic can get involved: imagine a form where managers can only see their employees' forms. But in the end, users expect to very easily set up a form library that mimics security in real world paper processes. SharePoint does not support this without custom code.

Five: Disabling the Office Web Apps feature does not disable Office Web Apps, with bonus Licensing and Uninstall Fiasco

Office Web Apps lets users display and edit Office docs in the browser, even if they don't have Office installed on their machine. Sounds cool, except the license states that only licensed Office users can use it. So, if you _have_ a license for Office, you can use the browser apps on your other non-Office-having machine. I guess if you have a work PC with Office, and a home Mac without, that would be mildly useful. If it were me, though, I'd get Office at home. Personally, I've found most users to be happy opening Word documents in the full Word client. To each their own. No matter: I'll just uninstall it. Except Microsoft's own guidance states that uninstalling Office Web Apps will remove the server from the farm. Because clearly, if you don't want Office Web Apps, you also feel like rebuilding the farm. Sigh. Never mind- I'll just disable the feature on every site collection. Ok, but then if users don't have Word installed, or if they use a browser that doesn't tell the server the client has Word installed (i.e. isn't IE), then Office Web Apps will _still_ open the document in the browser. There is a hack for this one, but in the end, users expect that disabling a feature really _disables the feature_.

Six: Managed Metadata Doesn't Work Everywhere

One of the much-talked-about new features of 2010 is Managed Metadata. On the surface, this is a welcome addition. It adds the ability to create taxonomies and folksonomies that cross site collections. However, this functionality is not supported in every case. Specifically, it is not supported in Content Organizer, Sandbox solutions, client object model, or InfoPath Browser forms. Various levels of hackery exist to work with each of these situations, but we shouldn't have to hack. What's the point of a killer new feature if it's half-baked?

Seven: Cryptic errors and No ULS Log Viewer

Correction: Dozens of 3rd party ULS Log Viewers because Microsoft expects you to muddle through text files to find errors. This should be baked-in and web-accessible. But the best log viewer doesn't help if the log entries are all "PC Load Letter". If I had a dollar for every 'unexpected error occurred' I've seen in SharePoint, I'd be retired and fly fishing in Montana instead of blogging about SharePoint. Here's a gem I got in the ULS log just today:

Microsoft.SharePoint.UserCode.SPUserCodeSolutionProxiedException: <nativehr>0x8102009b</nativehr><nativestack></nativestack>

Seriously, what am I supposed to do with that? Fortunately, I was able to Google '0x8102009b' and resolve it, but I shouldn't have to. Errors like that are so 1980.
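
Until a viewer is baked in, a few lines of PowerShell can at least do the muddling for you. This is a rough sketch, assuming the ULS logs are plain-text *.log files in a single folder (on a default SharePoint 2010 install that is the LOGS folder under the 14 hive). Find-UlsEntry is a hypothetical helper name, not a SharePoint cmdlet.

```powershell
# Hypothetical helper: scans every *.log file in a folder for a literal string,
# such as an error code, and reports where it was found.
function Find-UlsEntry {
    param(
        [Parameter(Mandatory = $true)][string]$LogDirectory,
        [Parameter(Mandatory = $true)][string]$Text
    )

    Get-ChildItem -Path $LogDirectory -Filter *.log |
        Select-String -SimpleMatch -Pattern $Text |
        ForEach-Object {
            # One result object per matching log line.
            New-Object PSObject -Property @{
                File       = $_.Filename
                LineNumber = $_.LineNumber
                Entry      = $_.Line.Trim()
            }
        }
}

# Example call on a farm server (default 2010 log location assumed):
# Find-UlsEntry -LogDirectory 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS' -Text '0x8102009b'
#
# The built-in route from the SharePoint Management Shell is along these lines:
# Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-30) | Where-Object { $_.Message -like '*0x8102009b*' }
```

It finds the entry faster, but it does nothing to make '0x8102009b' mean anything.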

Eight: Vomitous Markup

SharePoint 2010 has gotten much better about browser support. But one look at the markup for any given SharePoint page makes me want to puke. Tables within tables within tables, except when it's divs within divs within divs, and dozens of javascripts, and miles of CSS. And let's not even talk about custom branding. Again, I'm trying to be fair: it does a ton, and has come a long way since 2007. But let's have some meaningful, beautiful, minimal markup, accessibility, and true cross-browser compatibility.

Nine: SQL Team vs SharePoint Team vs Office Team: FIGHT!

SharePoint, it turns out, is developed by a conglomeration of various teams within Microsoft. It roughly falls under Office, but in practice there is some disconnect between Office "Server" and Office "Client" teams. InfoPath's lack of support for Managed Metadata is one good example of this. By all accounts, this was "on the list", but simply couldn't be accomplished in time for release. Or a CU. Or SP1 a year later. But if Client vs Server teams have some issues, SQL vs SharePoint teams are worse. Reporting Services, PowerPivot, and PerformancePoint all plug in to SharePoint, but each has unique idiosyncrasies or downright bugs that can be a beast. Reporting Services, for example, does not install as a service application, but instead as a standalone service running on one of your SharePoint boxes (this changes in SQL RS 2012, supposedly). But call support for Reporting Services, and you'll get the run-around and be bounced back and forth between "SharePoint Support" and "SQL Support". This is perhaps unavoidable due to the nature of the product, but unavoidable or not, users expect an "integrated" solution to be truly integrated, with no awkward seams, and typically do not care which support team or development team is responsible.

Ten: The 14 Hive

There's nothing Common or Shared about it, and it's a stretch to call it a Web Server Extension. Put that junk in C:\Program Files\SharePoint 2010 already.