Wednesday, August 5, 2009

Path.Combine is essentially useless

One of the great things about .NET, coming from C++, is all the stuff that is built in.  Need to send an email? Sure thing.  Want to use regular expressions? Go for it.  It took me a while to learn that things I expected to write myself in C++ were often already there in the library, if I looked for them.

The System.IO namespace has a Path class, which is used to manipulate file paths.  Things like 'GetFileNameWithoutExtension' are very, very useful.  Some things are a little counterintuitive, such as Path.GetDirectoryName walking up the directory tree if the string you have is a directory already, but overall, it saves a lot of work.
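To make that quirk concrete, here is a minimal sketch, assuming Windows and a made-up directory c:\path\dir.  Path.GetDirectoryName (the method's full name) treats the last segment as a file name unless the string ends with a separator:

```csharp
using System;
using System.IO;

class GetDirectoryNameDemo
{
    static void Main()
    {
        // No trailing slash: the last segment is treated as a file name and dropped.
        Console.WriteLine(Path.GetDirectoryName(@"c:\path\dir"));   // c:\path on Windows

        // Trailing slash: only the slash itself is removed.
        Console.WriteLine(Path.GetDirectoryName(@"c:\path\dir\"));  // c:\path\dir on Windows
    }
}
```

So whether you "walk up" a level depends entirely on whether the directory string happens to end with a slash.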

One of the things I use the most is Path.Combine, which takes two fragments and merges them to make a path.  In the past, I'd be checking if one string had a trailing slash, if the other had a leading slash, etc.  Path.Combine takes care of that for you.  Right? Not quite.

There are a couple of quirks here.  To illustrate, the following table has three columns.  The first two are the arguments passed into Path.Combine, the third is the result.


First argument | Second argument | Result
c:\path\       | dir\file.txt    | c:\path\dir\file.txt
c:\path\       | \dir\file.txt   | \dir\file.txt
c:\path        | dir\file.txt    | c:\path\dir\file.txt
c:\path        | \dir\file.txt   | \dir\file.txt
c:             | dir\file.txt    | c:dir\file.txt
c:             | \dir\file.txt   | \dir\file.txt
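
If you want to reproduce the rows above yourself, something like the following should do it.  This is only a sketch: the class name is made up, and the results are Windows-specific, since on other platforms \ is not a directory separator and Path.Combine behaves differently.

```csharp
using System;
using System.IO;

class CombineDemo
{
    static void Main()
    {
        // The three first arguments and two second arguments from the table.
        string[] firsts = { @"c:\path\", @"c:\path", "c:" };
        string[] seconds = { @"dir\file.txt", @"\dir\file.txt" };

        // Print every pairing in the same order as the table rows.
        foreach (string first in firsts)
        {
            foreach (string second in seconds)
            {
                Console.WriteLine("{0} + {1} = {2}", first, second, Path.Combine(first, second));
            }
        }
    }
}
```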

The first thing to notice is that if the second string starts with a \, you get the second string back verbatim.  This is the issue that hit me in the past.  I assumed this method existed so that, no matter what slashes happened to be in the two strings, they would get joined into a single path.  As you can see, this is not so.  I assume there's a specific case for which this behaviour is desirable, but it's not the most obvious one to me.  If there is a reason for it, surely the method could have an overload, or better yet, there could be a separate method whose name reflects the reasoning for not combining the two strings (a method called Combine is one I call to combine strings, not to SOMETIMES combine them).

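The second one is more interesting.  If my first string is a drive letter with no slash after it, then no slash is added.  I just ran a test: I have a file called c:\procs.txt.  File.Exists(@"c:procs.txt") returns false, while File.Exists(@"c:\procs.txt") returns true.  (Windows resolves c:procs.txt against the current directory on drive c: rather than against the root, so it only finds the file when that directory happens to be c:\.)  So, it seems to me that the slash is needed, but Path.Combine does not add it.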

Overall, this method is basically broken as far as I am concerned, and I have rolled my own version to use instead.  It makes sure the first string has a \ at the end and the second doesn't have one at the start, then calls Path.Combine, just out of spite (given that at this point I could just concatenate the two strings and be done with it).
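
For the curious, a wrapper along those lines might look roughly like this.  It is a minimal sketch, not my exact code: the names PathUtil and CombineAlways are made up for illustration, it assumes Windows path conventions, and it assumes neither argument is empty.

```csharp
using System;
using System.IO;

static class PathUtil
{
    // Hypothetical helper: always joins the two fragments, whatever slashes they carry.
    public static string CombineAlways(string first, string second)
    {
        char[] separators = { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar };

        // Drop leading separators so Path.Combine does not treat the
        // second fragment as rooted and return it verbatim.
        string tail = second.TrimStart(separators);

        // Make sure the first fragment ends with exactly one separator,
        // which also covers the bare drive letter case ("c:" becomes "c:\").
        string head = first.TrimEnd(separators) + Path.DirectorySeparatorChar;

        return Path.Combine(head, tail);
    }

    static void Main()
    {
        Console.WriteLine(CombineAlways(@"c:\path\", @"\dir\file.txt"));
        Console.WriteLine(CombineAlways("c:", @"dir\file.txt"));
    }
}
```

Note that a second fragment that carries its own drive letter (say d:\dir\file.txt) is still rooted after trimming, so Path.Combine will still hand it back unchanged.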
 

6 comments:

  1. IMHO, this is a fundamental problem with the "framework" (vs. "library") mindset: if you aren't writing the app the designers built the framework for, you'll run into bizarre scenarios where what looks like a general-purpose manipulation routine in fact exists for a very specific purpose (one not aligned with your own...)

    Near as I can tell, Path.Combine() is intended to produce a sane union of a user-specified path and an application-defined "base path". Think: typing a path into the "file name" entry field in the standard "File Open" dialog.

    In this specific scenario, several assumptions are made with regard to the potential nature of both paths, and the desired outcome.

    For the user path:
    - it could be a relative path (including a simple file name) - in this case, it should be combined with the base path.
    - it could be an absolute path - in this case, the base path should be ignored.
    - it could be a "drive-relative" path (a path relative to the root directory of the Current Drive) - in this case, the base path is ignored, and the user path is left alone.

    Note that the last case is one almost never actually desired in modern Windows applications (where the notions of Current Drive and Current Working Directory that mattered so much under DOS make little sense), but still supported (presumably for legacy reasons).

    For the base path:
    - it could be a full drive+directory path
    - it could be a drive-relative path
    - it could be a drive specification only

    All of these are combined ONLY with path-relative user paths...

    By now, we're pretty far into options that no one using Windows GUI apps has cared about in well over a decade. Again, we have the notion of drive-relative paths, and also stand-alone drive specifications. It helps to think back to that "File Open" dialog, and that ancient option to change the CWD in response to user actions. But it doesn't help much. In real life, users specifying drive-relative paths... or apps specifying only drive-specifiers for base paths... are less features, more frustrating bugs waiting to happen.

    Presumably, this all made sense to whoever wrote it. Perhaps he'd been working with the Windows file system so long that it seemed an obvious way to behave. But for those of us *not* planning to do clean room implementations of Explorer in .NET, a straight-forward, separator-intelligent, application-agnostic Path.Concat() would have been far more useful.

  2. It seems sillier when you look at the quite logical output of the PathCombine function in shlwapi:

    c:\path\ + dir\file.txt = c:\path\dir\file.txt
    c:\path\ + \dir\file.txt = c:\dir\file.txt
    c:\path + dir\file.txt = c:\path\dir\file.txt
    c:\path + \dir\file.txt = c:\dir\file.txt
    c: + dir\file.txt = c:\dir\file.txt
    c: + \dir\file.txt = c:\dir\file.txt

  3. I wasn't aware of the leading slash issue for the 2nd input. Good to know, thanks for making this known, CG.

  4. Actually I just faced another problem, with a trailing /. If the folder path ends with /, Path.Combine won't add a \ anymore.
    c:\Test with / + file.txt = c:\Test with /file.txt
    the correct one should be c:\Test with/\file.txt

  5. Did you post this issue on an MS forum?

    MS do vague things sometimes :/

  6. This always drives me crazy and I feel exactly the same as you do about it. In anger I typed "Path.Combine is useless" into my address bar and I'm pleased that I got a result that says pretty much exactly what I was thinking. (I have the same problem with constructing Urls, too; if you chain the Uri constructor that takes a Uri and a string, it's vulnerable to leading slashes in the string too. *stab*stab*stab*)
