This recipe reads a file containing stack traces produced by the Linux backtrace functions and appends the source file name and line number to each stack frame line. Usage and an example are included.
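
To make the input format concrete, here is a minimal sketch of how a single frame line could be split into its parts. It is not the recipe's own parser, which uses plain string searches and also tolerates leading text before the path. The names `FRAME_RE` and `parse_frame` are hypothetical, and the pattern assumes only the three frame formats listed in the recipe's usage text.

```python
import re

# Hypothetical pattern (not part of the recipe) covering the formats
#   <path>(<function>+0x<offset>)[0x<address>]
#   <path>[0x<address>]
#   [0x<address>]
FRAME_RE = re.compile(
    r'^\s*(?P<path>[^\s(\[]+)?'                           # optional path
    r'(?:\((?P<name>[^()+-]*)(?:[+-]0x[0-9a-fA-F]+)?\))?'  # optional (function+offset)
    r'\s*\[(?P<addr>0x[0-9a-fA-F]+)\]\s*$')                # [0x<address>]

def parse_frame(text):
    """Return (path, function, address) or None if text is not a frame line.

    Missing parts are returned as empty strings.
    """
    m = FRAME_RE.match(text)
    if m is None:
        return None
    return (m.group('path') or '', m.group('name') or '', m.group('addr'))

if __name__ == '__main__':
    print(parse_frame('/Python-2.5.1/python(main+0x17)[0x80562c7]'))
```

Only lines with a path can be passed to addr2line, and only lines with a function name can have their line number adjusted, which is why the sketch reports the three parts separately.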

Python, 334 lines
#! /usr/bin/env python

# -*- coding: iso-8859-1 -*- #

# Copyright, license and disclaimer are at the end of this file.

'''Usage: backtrace2line <backtrace_file> [<src_dir> ...]

   This program adds the source file name and line number to stack
   traces generated by Linux' function backtrace_symbols_fd.

   Each stack frame line**) in the input file originating from the
   function backtrace_symbols_fd is extended with the file name and
   line number returned by utility  addr2line.  All other lines from
   the input file remain unchanged.

   However, line numbers returned by  addr2line are often inaccurate
   and typically too high.  If any <src_dir> arguments are supplied
   on the command line, the line numbers from  addr2line may be
   adjusted after searching the source file for the nearest lines
   containing the symbol.  Adjusted line numbers are marked with *.

   Symbol search is simplistic.  Code and comment lines in source
   files are searched within a limited range around the line number
   returned by  addr2line.

   Example of a stack trace obtained from an instrumented Python
   2.5.1 binary (abbreviated), before

   /Python-2.5.1/python(PyObject_Free+0x15b)[0x8088b07]
   /Python-2.5.1/python(PyDict_SetItem+0x1a6)[0x80824ae]
   /Python-2.5.1/python(_PyModule_Clear+0x146)[0x80854ca]
   /Python-2.5.1/python(PyImport_Cleanup+0x291)[0x80d6b25]
   /Python-2.5.1/python(Py_Finalize+0xaf)[0x80e3b53]
   /Python-2.5.1/python(Py_Main+0x30b)[0x80565d7]
   /Python-2.5.1/python(main+0x17)[0x80562c7]

   after backtrace2line

   /Python-2.5.1/python(PyObject_Free+0x15b)[0x8088b07]  Objects/obmalloc.c:1123
   /Python-2.5.1/python(PyDict_SetItem+0x1a6)[0x80824ae]  Objects/dictobject.c:412
   /Python-2.5.1/python(_PyModule_Clear+0x146)[0x80854ca]  Objects/moduleobject.c:136
   /Python-2.5.1/python(PyImport_Cleanup+0x291)[0x80d6b25]  Python/import.c:469
   /Python-2.5.1/python(Py_Finalize+0xaf)[0x80e3b53]  Python/pythonrun.c:419
   /Python-2.5.1/python(Py_Main+0x30b)[0x80565d7]  Modules/main.c:565
   /Python-2.5.1/python(main+0x17)[0x80562c7]  ./Modules/python.c:24

   after backtrace2line ... with <src_dir> adjusting several line numbers

   /Python-2.5.1/python(PyObject_Free+0x15b)[0x8088b07]  Objects/obmalloc.c:1123
   /Python-2.5.1/python(PyDict_SetItem+0x1a6)[0x80824ae]  Objects/dictobject.c:412
   /Python-2.5.1/python(_PyModule_Clear+0x146)[0x80854ca]  Objects/moduleobject.c:136*
   /Python-2.5.1/python(PyImport_Cleanup+0x291)[0x80d6b25]  Python/import.c:468*
   /Python-2.5.1/python(Py_Finalize+0xaf)[0x80e3b53]  Python/pythonrun.c:397*
   /Python-2.5.1/python(Py_Main+0x30b)[0x80565d7]  Modules/main.c:545*
   /Python-2.5.1/python(main+0x17)[0x80562c7]  ./Modules/python.c:23*

----
  *) Stack frames from backtrace_symbols_fd are listed in order
     most-recently-called-first, i.e. main is near the bottom
     of each backtrace list.

 **) Stack frame lines from  backtrace_symbols_fd have one of
     the following three formats:

        <path>(<function>*0x<offset>)[0x<address>]
     or
        <path>[0x<address>]
     or
        [0x<address>]

     where * means + or -.  Only frames starting with a <path>
     can be extended with a file name and line number and only
     frames containing a <function> can be adjusted.
'''

__version__ = '1.2 (Dec 20, 2010)'

import os, sys

class _Frame(object):
    '''Stack frame object.
    '''
    addr = ''    # address string, 0x...
    call = None  # frame called
    file = ''    # source file name
    line = ''    # source line number string
    lino = 0     # line number in original, input file
    name = ''    # function name/symbol iff present
    path = ''    # library or executable path

    def __init__(self, text, lino=0, call=None, skip=None):
        b = text.rfind('[0x')
        if b > 0:
             # get address
            a = text[b:].rstrip()
            if a.endswith(']') and len(a) <= 20:  # 64-bit
                 # save in frame
                self.addr = a[1:-1]
                self.call = call
                self.lino = lino
                if skip:  # till start of path
                    a = text.find(skip, 0, b) + 1
                else:
                    a = 0
                 # get path till '(' iff present
                if text[b-1] == ')':
                    p = text.find('(', a, b) 
                    if p > a:
                        self.path = text[a:p]
                         # get function till offset
                        f = text.find('0x', p, b) - 1
                        if f > 0 and (text[f] == '+' or text[f] == '-'):
                            self.name = text[p+1:f]
                else:
                    self.path = text[a:b]

_ARG_MAX = 256  # some limit for addr2line

def _addr2line(path, *args):
    '''Return list of filename:linenumbers for
       address(es) in a library or executable.
    '''
     # global _addr2line_bin
    r, a = [], args
     # recurse for long lists, so that no single
     # call passes more than _ARG_MAX addresses
    if len(a) > _ARG_MAX:
        r = _addr2line(path, *a[:-_ARG_MAX])
        a = a[-_ARG_MAX:]
     # _addr2line_bin prints one filename:linenumber
     # line per address or '??:0' in case of errors
    t = "%s -e %s %s" % (_addr2line_bin, path, ' '.join(a))
    try:
        p = os.popen(t)
        r.extend(p.readlines())
        p.close()
    except:
        _print("%.*s... (%d) failed", 80,t, len(a))
        return []
    if len(r) != len(args):
        _print("%.*s... mismatch: %d vs %d", 80,t, len(r), len(args))
    return r

def _adjust(frames, srcpath):
    '''For each frame, find the function called in the
       source file near the line number from  addr2line
       and if found adjust the line number accordingly.
    '''
     # sort frames by source file name
    frames.sort(key=lambda f: f.file)
     # cache source lines
    p, t, r = '', [], 0
    for f in frames:
        if f.call and f.call.name and f.file != '??':
             # get source file as lines
            p, t = _source(_which(f.file, srcpath), p, t)
            if t:  # find nearest line(s)
                n = _search(f.call.name, t, int(f.line.rstrip()))
                if n:  # adjust frame line
                    f.line = '+'.join([str(i) for i in n]) + '*' + os.linesep
                    r += len(n)
    return r  # number of adjustments

def _backtrace2line(name, ldpath=None, srcpath=None):
    '''Get a backtrace file and use addr2line
       on every stack frame line in that file,
       with adjusted line number if requested.
    '''
    try:
        f = open(name, 'rt')
        ts = f.readlines()
        f.close()
    except:
        _print("open failed: %r", name)
        return  # None
     # skip till start of path
    s = ' ' + os.path.sep
     # create a _Frame instance for each stack
     # frame line and collect all _Frames of a
     # library or executable in a separate list
    pfs, f = {}, None
    for i, t in enumerate(ts):
        f = _Frame(t, i, f, skip=s)
        p = f.path
        if p in pfs:
            pfs[p].append(f)
        elif p:
            pfs[p] = [f,]
        else:  # not a <path> frame
            f = None
     # for each library or executable, get a list
     # of addresses, pass those to addr2line, save
     # the file name and (adjusted) line number in
     # the _Frame and append the final result to
     # the original line of the input file
    r = 0
    for p, fs in pfs.items():
        p = _which(p, ldpath)
        if p:
            a = [f.addr for f in fs]
            s = _addr2line(p, *a)
            if s:
                for i, f in enumerate(fs):
                    f.file, f.line = s[i].rsplit(':', 1)
                if srcpath:
                    r += _adjust(fs, srcpath)
                for f in fs:
                    i = f.lino
                    t = ts[i].rstrip()  # line break
                    ts[i] = t + '  ' + f.file + ':' + f.line
    ##_print("%d adjusted line numbers*", r)
    return ''.join(ts)  # as string

def _print(fmt, *args):
    '''Print a message.
    '''
    print("%s: %s" % (sys.argv[0], (fmt % args)))

def _search(name, text, line, before=32, after=16):
    '''Find name in text lines within a range
       before and after a given line number.
    '''
    b = max(line - max(0, before), 1)  # 1-origin
    a = min(line + max(0, after), len(text))
    r = []  # search back- and forward
    for n, s in [(line, -1), (line+1, +1)]:
        while b <= n <= a:
            if text[n-1].find(name) < 0:
                n += s
            else:
                r.append(n)
                break  # while
    return r  # list of 0, 1 or 2 line numbers

def _source(path, prev, text):
    '''Get source lines of a source file
       iff different from the previous one.
    '''
    if path == prev:
        t = text
    elif path:
        try:
            f = open(path, 'rt')
            t = f.readlines()
            f.close()
        except:
            t = []
    else:
        t = []
    return (path, t)

def _which(name, PATH=None, exit=0):
    '''Find fully qualified path for a file.
    '''
    n = os.path.expanduser(name)
    if os.path.isabs(n):
        return n
    p = PATH or os.environ.get('PATH', '')
    for d in p.split(os.pathsep):
        f = os.path.join(d, n)
        if os.path.isfile(f):
            return f
    if exit:
        _print("utility %r missing", name)
        sys.exit(exit)
    return None

_addr2line_bin = _which('addr2line', exit=os.EX_OSFILE)

if __name__ == '__main__':

    argc = len(sys.argv)
    if argc < 2 or sys.argv[1].startswith('-'):
        _print("usage: %s <backtrace_file> [<src_dir> ...]", os.path.basename(sys.argv[0]))
        sys.exit(os.EX_USAGE)

     # default library path
    ldp = '/usr/local/lib:/usr/lib:/lib'
    if sys.platform.startswith('darwin'):
        ldp = os.environ.get('DYLD_LIBRARY_PATH',
              os.environ.get('DYLD_FALLBACK_LIBRARY_PATH', ldp))
    else:  # assume *nix
        ldp = os.environ.get('LD_LIBRARY_PATH', ldp)

    if argc > 2:  # check src_dirs
        ds = sys.argv[2:]
        for d in ds:
            if not os.path.isdir(d):
                _print("not a directory: %r", d)
                sys.exit(os.EX_OSFILE)
        print(_backtrace2line(sys.argv[1], ldp, os.pathsep.join(ds)))

    else:
        print(_backtrace2line(sys.argv[1], ldp, None))


#---------------------------------------------------------------------
#   Copyright (c) 2007-2010 -- Jean Brouwers.  All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# - Redistributions of source code must retain the above copyright
#   notice, this list of conditions and the following disclaimer.
#
# - Redistributions in binary form must reproduce the above copyright
#   notice, this list of conditions and the following disclaimer in
#   the documentation and/or other materials provided with the
#   distribution.
#
# - Neither the name Jean Brouwers nor the names of any of the
#   contributors may be used to endorse or promote products derived
#   from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
# COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
# STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
# OF THE POSSIBILITY OF SUCH DAMAGE.
#---------------------------------------------------------------------
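
The recipe collects the addresses of each library or executable and hands them to addr2line in batches of at most `_ARG_MAX` addresses. The following is a minimal sketch of that batching idea, assuming an argument list passed to `subprocess` instead of the shell command string the recipe builds. The names `chunks`, `addr2line_commands` and `resolve` are hypothetical and not part of the recipe.

```python
import subprocess

def chunks(items, size=256):
    """Yield consecutive slices of items, each with at most size elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def addr2line_commands(binary, addrs, tool='addr2line', size=256):
    """Build one argument list per batch: [tool, '-e', binary, addr, ...]."""
    return [[tool, '-e', binary] + list(c) for c in chunks(addrs, size)]

def resolve(binary, addrs, tool='addr2line'):
    """Run the tool once per batch and return its output lines in order.

    Assumes the tool prints one file:line entry per address, with
    '??:0' for addresses it cannot resolve.
    """
    out = []
    for cmd in addr2line_commands(binary, addrs, tool):
        text = subprocess.check_output(cmd, universal_newlines=True)
        out.extend(text.splitlines())
    return out
```

Passing a list of arguments avoids quoting problems with unusual paths, at the cost of departing from the `os.popen` call used in the recipe.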

3 comments

Jean Brouwers (author) 16 years, 4 months ago

It turns out that the line numbers returned by addr2line are often inaccurate, typically too high, by as much as 24 lines.

Jean Brouwers (author) 16 years, 4 months ago

An enhancement of this recipe attempts to adjust the line number from addr2line by searching for the function name in the source file. Unfortunately, posting the enhanced recipe as a Comment scrambles and truncates the text.
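
The adjustment described in this comment can be illustrated with a small sketch. It is a simplified variant, not the recipe's `_search` function: it returns only the single closest matching line, whereas the recipe may report both the nearest match before and the nearest match after the reported line. The name `nearest_line` is hypothetical; the default window of 32 lines before and 16 lines after is taken from the recipe.

```python
def nearest_line(name, lines, line, before=32, after=16):
    """Return the 1-based number of the line closest to `line` that
    contains `name`, searching a limited window, or None if not found.

    Ties are resolved in favour of the earlier line.
    """
    lo = max(line - before, 1)
    hi = min(line + after, len(lines))
    best = None
    for n in range(lo, hi + 1):
        if name in lines[n - 1]:
            if best is None or abs(n - line) < abs(best - line):
                best = n
    return best

if __name__ == '__main__':
    source = ['void helper(void);',
              'int main(void)',
              '{',
              '    helper();',
              '    return 0;',
              '}']
    # addr2line reported line 5; the call to helper is on line 4
    print(nearest_line('helper', source, 5))
```

As in the recipe, this is a plain substring search, so matches inside comments or longer identifiers are counted as well.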

Jean Brouwers (author) 16 years, 1 month ago

The recipe source text has been updated with the enhanced version.