Welcome, guest | Sign In | My Account | Store | Cart

While dealing with paths, it's often necessary to make sure they all have the same structure so any operation you perform on them can be reliable, specially when it comes to comparing two or more paths. Unusual paths like "/this//is//a///path" or "another/path" can cause unexpected behavior in your application and this is where this function comes into play.

Python, 70 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
# -*- coding: utf-8 -*-
from re import compile as compile_regex

_MULTIPLE_PATHS = compile_regex(r"/{2,}")


def normalize_path(path):
    """
    Normalize ``path``.
    
    It returns ``path`` with leading and trailing slashes, and no multiple
    continuous slashes.
    
    """
    if path:
        if path[0] != "/":
            path = "/" + path
        
        if path[-1] != "/":
            path = path + "/"
        
        path = _MULTIPLE_PATHS.sub("/", path)
    else:
        path = "/"
    
    return path


# ------ UNIT TESTS ------
from nose.tools import eq_


class TestNormalizingPath(object):
    """Tests for :func:`normalize_path`."""
    
    def test_empty_string(self):
        path_normalized = normalize_path("")
        eq_(path_normalized, "/")
    
    def test_slash(self):
        path_normalized = normalize_path("/")
        eq_(path_normalized, "/")
    
    def test_no_leading_slash(self):
        path_normalized = normalize_path("path/")
        eq_(path_normalized, "/path/")
    
    def test_no_trailing_slash(self):
        path_normalized = normalize_path("/path")
        eq_(path_normalized, "/path/")
    
    def test_trailing_and_leading_slashes(self):
        path_normalized = normalize_path("/path/")
        eq_(path_normalized, "/path/")
    
    def test_multiple_leading_slashes(self):
        path_normalized = normalize_path("////path/")
        eq_(path_normalized, "/path/")
    
    def test_multiple_trailing_slashes(self):
        path_normalized = normalize_path("/path////")
        eq_(path_normalized, "/path/")
    
    def test_multiple_inner_slashes(self):
        path_normalized = normalize_path("/path////here/")
        eq_(path_normalized, "/path/here/")
    
    def test_unicode_path(self):
        path_normalized = normalize_path(u"mañana/aquí")
        eq_(path_normalized, u"/mañana/aquí/")

The goal is just to make paths "normal", no matter where they come from (e.g., user input). Resolving relative paths and the like is out of the scope.

5 comments

Daniel Lepage 14 years, 2 months ago  # | flag

Nice recipe, but I think this function already exists as os.path.normpath

>>> import os.path
>>> os.path.normpath('/foo//bar/../baz////././../baz/spam')
'/foo/baz/spam'

And incidentally, os.path.abspath resolves relative paths. See os.path in the library reference

Gustavo Narea (author) 14 years, 2 months ago  # | flag

Thanks, Daniel! I wasn't aware of that function, but it's not the same:

"On Windows, it converts forward slashes to backward slashes. It should be understood that this may change the meaning of the path if it contains symbolic links!"

The recipe above returns the same independently of the OS being used, and does not access the filesystem, so you can use it everywhere.

Gabriel Genellina 14 years, 2 months ago  # | flag

(os.path.normpath doesn't access the filesystem either - that's the reason it says it could change the meaning of some paths)

Also note that foo/bar and /foo/bar/ aren't the same thing -- you should not add a leading slash!

Gustavo Narea (author) 14 years, 2 months ago  # | flag

@Gabriel, you're right about normpath(), I misread the documentation. Regarding the leading slash, I really should for the program I'm working on (which is not related to actual paths in filesystems)... I think it's best for people taking the recipe to change this behavior if they want to.

Anastasia 7 years, 9 months ago  # | flag

The problem with os.abspath or os.normpath is that on Windows they break Unix paths:

>>> import os
>>> os.path.abspath('/my/path')
'C:\\my\\path'
>>> os.path.normpath('/my/path')
'\\my\\path'
Created by Gustavo Narea on Tue, 26 Jan 2010 (MIT)
Python recipes (4591)
Gustavo Narea's recipes (1)

Required Modules

  • (none specified)

Other Information and Tasks