While dealing with paths, it's often necessary to make sure they all have the same structure so any operation you perform on them can be reliable, specially when it comes to comparing two or more paths. Unusual paths like "/this//is//a///path" or "another/path" can cause unexpected behavior in your application and this is where this function comes into play.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 | # -*- coding: utf-8 -*-
from re import compile as compile_regex
_MULTIPLE_PATHS = compile_regex(r"/{2,}")
def normalize_path(path):
"""
Normalize ``path``.
It returns ``path`` with leading and trailing slashes, and no multiple
continuous slashes.
"""
if path:
if path[0] != "/":
path = "/" + path
if path[-1] != "/":
path = path + "/"
path = _MULTIPLE_PATHS.sub("/", path)
else:
path = "/"
return path
# ------ UNIT TESTS ------
from nose.tools import eq_
class TestNormalizingPath(object):
"""Tests for :func:`normalize_path`."""
def test_empty_string(self):
path_normalized = normalize_path("")
eq_(path_normalized, "/")
def test_slash(self):
path_normalized = normalize_path("/")
eq_(path_normalized, "/")
def test_no_leading_slash(self):
path_normalized = normalize_path("path/")
eq_(path_normalized, "/path/")
def test_no_trailing_slash(self):
path_normalized = normalize_path("/path")
eq_(path_normalized, "/path/")
def test_trailing_and_leading_slashes(self):
path_normalized = normalize_path("/path/")
eq_(path_normalized, "/path/")
def test_multiple_leading_slashes(self):
path_normalized = normalize_path("////path/")
eq_(path_normalized, "/path/")
def test_multiple_trailing_slashes(self):
path_normalized = normalize_path("/path////")
eq_(path_normalized, "/path/")
def test_multiple_inner_slashes(self):
path_normalized = normalize_path("/path////here/")
eq_(path_normalized, "/path/here/")
def test_unicode_path(self):
path_normalized = normalize_path(u"mañana/aquí")
eq_(path_normalized, u"/mañana/aquí/")
|
The goal is just to make paths "normal", no matter where they come from (e.g., user input). Resolving relative paths and the like is out of the scope.
Nice recipe, but I think this function already exists as
os.path.normpath
And incidentally,
os.path.abspath
resolves relative paths. Seeos.path
in the library referenceThanks, Daniel! I wasn't aware of that function, but it's not the same:
The recipe above returns the same independently of the OS being used, and does not access the filesystem, so you can use it everywhere.
(os.path.normpath doesn't access the filesystem either - that's the reason it says it could change the meaning of some paths)
Also note that foo/bar and /foo/bar/ aren't the same thing -- you should not add a leading slash!
@Gabriel, you're right about normpath(), I misread the documentation. Regarding the leading slash, I really should for the program I'm working on (which is not related to actual paths in filesystems)... I think it's best for people taking the recipe to change this behavior if they want to.
The problem with os.abspath or os.normpath is that on Windows they break Unix paths: