export VAR1=foo
export VAR2=bar
export VAR3=$VAR1$VAR2
export VAR4=${VAR1}$VAR2
  export VAR5=${VAR1}indent
export VAR6="text${VAR1} " # With embedded spaces and a comment
export VAR7='${VAR4}' # Leave text within tics as-is

will be read as:

{'VAR1': 'foo',
 'VAR2': 'bar',
 'VAR3': 'foobar',
 'VAR4': 'foobar',
 'VAR5': 'fooindent',
 'VAR6': 'textfoo ',
 'VAR7': '${VAR4}'}
Python, 37 lines
#!/usr/bin/env python3
import pprint
import re
import sys
TIC = "'"
QUOTE = '"'
def parse_profile(file_name):
    return_dict = dict()
    with open(file_name) as reader:
        for line in reader:
            # Strip only a leading "export", not one that appears inside a value
            line = re.sub(r"^export\s+", "", line.strip())
            if "=" in line:
                key, value = line.split("=", 1)
                # Values that are wrapped in tics:  remove the tics but otherwise leave as is
                if value.startswith(TIC):
                    # Remove first tic and everything after the last tic
                    last_tic_position = value.rindex(TIC)
                    value = value[1:last_tic_position]
                    return_dict[key] = value
                    continue
                # Values that are wrapped in quotes:  remove the quotes and optional trailing comment
                elif value.startswith(QUOTE):
                    # .* (not .+) so a value with nothing after the closing quote is also unwrapped
                    value = re.sub(r'^"(.*?)".*', r'\g<1>', value)
                # Values that are followed by whitespace or comments:  remove the whitespace and/or comments
                else:
                    value = re.sub(r'(#|\s+).*', '', value)
                for variable in re.findall(r"\$\{?\w+\}?", value):
                    # Find embedded shell variables
                    dict_key = variable.strip("${}")
                    # Replace them with their values
                    value = value.replace(variable, return_dict.get(dict_key, ""))
                # Add this key to the dictionary
                return_dict[key] = value
    return return_dict

if __name__ == '__main__':
    pprint.pprint(parse_profile(sys.argv[1]))

2 comments

Paolo Mossino 9 years, 1 month ago

Nice, but you are parsing an sh-like profile, so it would be better not to expand $variables in a 'single quoted ${string}'.

I propose changing the script so that it does not replace variables when the value is a single-quoted string. We can reset a variable called is_single_quoted every time we get an "=":

# ...
if "=" in line:
    is_single_quoted = False

Then, if the value starts with ', you also set:

    if value.startswith("'"):
        # ...
        is_single_quoted = True

Last change: run the $variable replacement only if the value is not single-quoted:

    if not is_single_quoted:
        for variable in re.findall(r"\$\{?\w+\}?", value):
            # ...
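
Put together, the three changes above might look like the following minimal sketch. The function name parse_profile_lines and its list-of-lines argument are assumptions made for illustration (the recipe itself reads from a file), and the double-quote regex is loosened slightly so that a value with nothing after its closing quote is also unwrapped.

```python
import re


def parse_profile_lines(lines):
    """Hypothetical helper: parse 'export KEY=value' lines into a dict.

    Values wrapped in single quotes are kept verbatim (no $variable expansion).
    """
    result = {}
    for raw_line in lines:
        line = re.sub(r"^export\s+", "", raw_line.strip())
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        is_single_quoted = False  # reset for every assignment
        if value.startswith("'"):
            # Keep everything between the first and the last tic as-is
            value = value[1:value.rindex("'")]
            is_single_quoted = True
        elif value.startswith('"'):
            # Remove the quotes and anything that follows the closing quote
            value = re.sub(r'^"(.*?)".*', r"\g<1>", value)
        else:
            # Remove trailing whitespace and/or comments
            value = re.sub(r"(#|\s+).*", "", value)
        if not is_single_quoted:
            # Replace embedded shell variables with previously parsed values
            for variable in re.findall(r"\$\{?\w+\}?", value):
                dict_key = variable.strip("${}")
                value = value.replace(variable, result.get(dict_key, ""))
        result[key] = value
    return result


sample = [
    "export VAR1=foo",
    "export VAR2=bar",
    "export VAR3=$VAR1$VAR2",
    "export VAR7='${VAR3}' # Leave text within tics as-is",
]
parsed = parse_profile_lines(sample)
```

With this sample, VAR3 is expanded from VAR1 and VAR2, while VAR7 keeps the literal text ${VAR3} because its value is single-quoted.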

Jason Friedman (author) 9 years, 1 month ago

Good point, Paolo. I adjusted my code, though not in the manner you suggested. Please review and comment.