How to split a DOS path into its components in Python


Question

I have a string variable which represents a DOS path, e.g.:

var = "d:\stuff\morestuff\furtherdown\THEFILE.txt"

I want to split this string into:

[ "d", "stuff", "morestuff", "furtherdown", "THEFILE.txt" ]

I have tried using split() and replace() but they either only process the first backslash or they insert hex numbers into the string.

I need to convert this string variable into a raw string somehow so that I can parse it.

What's the best way to do this?

I should also add that the contents of var, i.e. the path that I'm trying to parse, is actually the return value of a command line query. It's not path data that I generate myself. It's stored in a file, and the command line tool is not going to escape the backslashes.
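
To illustrate the symptom described above, here is a minimal sketch (both variable names are made up for this example). It assumes the path was typed as a normal string literal in the source code, which is the only place the escaping happens; text read from a file or from a command's output should arrive with its backslashes intact.

```python
# "\f" inside a normal string literal is the form feed character (0x0c),
# so that backslash is gone before split() ever runs.
typed_literal = "furtherdown\file.txt"

# A raw literal keeps every backslash as an ordinary character.
raw_literal = r"furtherdown\file.txt"

print(typed_literal.split("\\"))  # ['furtherdown\x0cile.txt']
print(raw_literal.split("\\"))    # ['furtherdown', 'file.txt']
```

The "\x0c" in the first result is the kind of hex number that shows up when a backslash has been consumed by an escape sequence.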

7/3/2010 6:38:14 PM

Accepted Answer

I've been bitten loads of times by people writing their own path fiddling functions and getting it wrong. Spaces, slashes, backslashes, colons -- the possibilities for confusion are not endless, but mistakes are easily made anyway. So I'm a stickler for the use of os.path, and recommend it on that basis.

(However, the path to virtue is not the one most easily taken, and many people when finding this are tempted to take a slippery path straight to damnation. They won't realise until one day everything falls to pieces, and they -- or, more likely, somebody else -- has to work out why everything has gone wrong, and it turns out somebody made a filename that mixes slashes and backslashes -- and some person suggests that the answer is "not to do that". Don't be any of these people. Except for the one who mixed up slashes and backslashes -- you could be them if you like.)

You can get the drive and path+file like this:

import os
drive, path_and_file = os.path.splitdrive(path)

Get the path and the file:

path, file = os.path.split(path_and_file)
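
As a minimal sketch of those two calls, the example below uses ntpath, the Windows flavour of os.path, so that it behaves the same on any operating system (on Windows, os.path is ntpath). The sample path is the one from the question, written as a raw string; the names folder_path and file_name are chosen for this example only.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

path = r"d:\stuff\morestuff\furtherdown\THEFILE.txt"

# Separate the drive letter from the rest of the path.
drive, path_and_file = ntpath.splitdrive(path)

# Separate the directory part from the final component.
folder_path, file_name = ntpath.split(path_and_file)

print(drive)        # d:
print(folder_path)  # \stuff\morestuff\furtherdown
print(file_name)    # THEFILE.txt
```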

Getting the individual folder names is not especially convenient, but it is the sort of honest middling discomfort that heightens the pleasure of later finding something that actually works well:

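folders = []
while True:
    # Peel the last component off the end of the path.
    path, folder = os.path.split(path)

    if folder != "":
        folders.append(folder)
    else:
        # Nothing left to peel off; keep any remaining root (e.g. "\").
        if path != "":
            folders.append(path)

        break

# Components were collected from last to first.
folders.reverse()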

(This puts a "\" at the start of folders if the path was originally absolute. You could drop the inner check that appends path if you didn't want that.)
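
The loop can be wrapped into a function, sketched here under a few assumptions: split_all is a hypothetical helper name, the input is always a Windows-style path, and ntpath is used in place of os.path so the result does not depend on the machine running the script. The drive is stripped first with splitdrive so that it comes back separately.

```python
import ntpath  # Windows implementation of os.path, importable on any OS


def split_all(path):
    """Return (drive, components) for a Windows-style path (hypothetical helper)."""
    drive, rest = ntpath.splitdrive(path)
    parts = []
    while True:
        rest, tail = ntpath.split(rest)
        if tail:
            parts.append(tail)
        else:
            # Only a root such as "\" (or nothing at all) is left.
            if rest:
                parts.append(rest)
            break
    parts.reverse()
    return drive, parts


drive, parts = split_all(r"d:\stuff\morestuff\furtherdown\THEFILE.txt")
print(drive)  # d:
print(parts)  # ['\\', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']

# The list asked for in the question: drive letter without the colon, no root.
wanted = [drive.rstrip(":")] + [p for p in parts if p != "\\"]
print(wanted)  # ['d', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']
```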

4/22/2015 4:34:15 PM

I would do

import os
path = os.path.normpath(path)
parts = path.split(os.sep)

First normalize the path string into the proper form for the OS. After that, os.sep should be safe to use as the delimiter for the string's split method.
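
One caveat worth sketching: os.path.normpath and os.sep follow the conventions of the machine the script runs on, so on Linux or macOS a backslash is not treated as a separator. The example below assumes the input is always a Windows-style path and therefore uses ntpath and pathlib.PureWindowsPath explicitly; that substitution is an assumption of this sketch, not part of the answer above. The sample path deliberately mixes slashes and backslashes.

```python
import ntpath  # Windows implementation of os.path, importable on any OS
from pathlib import PureWindowsPath

mixed = "d:/stuff\\morestuff/furtherdown\\THEFILE.txt"

# normpath turns every "/" into "\", so splitting on ntpath.sep is enough.
parts_from_split = ntpath.normpath(mixed).split(ntpath.sep)
print(parts_from_split)
# ['d:', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']

# pathlib does the same job; the drive and root come back as one component.
parts_from_pathlib = PureWindowsPath(mixed).parts
print(parts_from_pathlib)
# ('d:\\', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt')
```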


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow