
Boyer-Moore-Horspool string searching (Python recipe) by Nelson Rush
ActiveState Code (http://code.activestate.com/recipes/117223/)

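A string searching algorithm derived from Boyer-Moore, which is widely regarded as one of the most efficient general-purpose string searching algorithms. Boyer-Moore-Horspool keeps only a simplified bad-character shift table, indexed by the text character aligned with the last position of the pattern, and drops the good-suffix rule. It is therefore simpler to implement and needs less preprocessing than full Boyer-Moore, and on typical text it is often just as fast or faster, although its worst case is O(mn) for a pattern of length m and a text of length n.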
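To make the shift table concrete, here is a minimal sketch of how it could be built on its own. The name `build_skip_table` is hypothetical and not part of the recipe; a dict is assumed in place of a fixed-size list so that any character can be a key, and characters absent from the table are taken to shift by the full pattern length.

```python
def build_skip_table(pattern):
    """Hypothetical helper: bad-character shift table for Horspool search.

    Maps each character of pattern[:-1] to the distance from its last
    occurrence to the end of the pattern. Characters not in the table
    are assumed to shift by len(pattern).
    """
    m = len(pattern)
    table = {}
    for k in range(m - 1):
        # Later occurrences overwrite earlier ones, so the smallest
        # (safest) shift for each character is the one that is kept.
        table[pattern[k]] = m - k - 1
    return table
```

For the pattern "the", `build_skip_table("the")` gives `{'t': 2, 'h': 1}`; the final character "e" is deliberately left out, so a window ending in "e" that does not match shifts by the full length 3.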

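# bmh.py
#
# An implementation of Boyer-Moore-Horspool string searching.
#
# This code is Public Domain.
#
def BoyerMooreHorspool(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m = len(pattern)
    n = len(text)
    if m > n:
        return -1
    # Shift table: for each character in pattern[:-1], the distance from its
    # last occurrence to the end of the pattern. Characters not in the table
    # shift by the full length m. A dict (rather than a 256-entry list) also
    # handles characters with code points above 255.
    skip = {}
    for k in range(m - 1):
        skip[pattern[k]] = m - k - 1
    k = m - 1  # index in text aligned with the last pattern character
    while k < n:
        # Compare right to left from the end of the current window.
        j = m - 1
        i = k
        while j >= 0 and text[i] == pattern[j]:
            j -= 1
            i -= 1
        if j == -1:
            return i + 1
        k += skip.get(text[k], m)
    return -1

if __name__ == '__main__':
    text = "this is the string to search in"
    pattern = "the"
    s = BoyerMooreHorspool(pattern, text)
    print('Text:', text)
    print('Pattern:', pattern)
    if s > -1:
        print('Pattern "' + pattern + '" found at position', s)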
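The recipe returns only the first match. As a rough sketch of how the same shift rule could report every occurrence, including overlapping ones, the generator below keeps sliding the window after a hit. The name `bmh_find_all` is hypothetical, and a slice comparison stands in for the explicit right-to-left loop; an empty pattern is assumed to yield nothing.

```python
def bmh_find_all(pattern, text):
    """Hypothetical variant: yield the start index of every occurrence."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return
    # Same shift table as the recipe, stored in a dict.
    skip = {ch: m - k - 1 for k, ch in enumerate(pattern[:-1])}
    k = m - 1  # text index aligned with the last pattern character
    while k < n:
        # Slice comparison replaces the character-by-character loop.
        if text[k - m + 1:k + 1] == pattern:
            yield k - m + 1
        # The shift depends only on text[k], so it is safe to apply
        # after a match as well as after a mismatch.
        k += skip.get(text[k], m)
```

For example, `list(bmh_find_all("the", "the cat and the hat"))` gives `[0, 12]`.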

      

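This algorithm is often faster in practice than KMP and is simple to implement. It balances memory consumption and speed well: the only extra storage is the shift table, which needs at most one entry per alphabet symbol. Published comparisons of string searching algorithms generally show it performing well on ordinary text. According to Moore himself, this algorithm gets faster the larger the pattern: when the text character under the end of the window does not occur in the pattern, the search skips ahead by the full pattern length m, so in the best case only about n/m characters of the text are inspected. The worst case remains O(mn), for example on highly repetitive input.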
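One way to see the effect of pattern length is to count character comparisons. The sketch below is an instrumented copy of the search, under the assumption that counting comparisons is a fair proxy for work done; `bmh_count_comparisons` is a hypothetical name, not part of the recipe.

```python
def bmh_count_comparisons(pattern, text):
    """Hypothetical helper: return (match index or -1, comparisons made)."""
    m, n = len(pattern), len(text)
    skip = {ch: m - k - 1 for k, ch in enumerate(pattern[:-1])}
    comparisons = 0
    k = m - 1
    while k < n:
        i, j = k, m - 1
        # Right-to-left comparison, counting every character test.
        while j >= 0:
            comparisons += 1
            if text[i] != pattern[j]:
                break
            i -= 1
            j -= 1
        if j < 0:
            return i + 1, comparisons
        k += skip.get(text[k], m)
    return -1, comparisons
```

On a text of 1000 "x" characters, searching for "ab" costs 500 comparisons while searching for "abcdefgh" costs 125, because every mismatch skips a full pattern length. The opposite extreme is a pattern such as "baaa" against a run of "a" characters, where each window is compared almost completely and the shift is only 1.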

Tags: algorithms

Created by Nelson Rush on Thu, 7 Mar 2002 (PSF)


Required Modules

(none specified)

Other Information and Tasks

Licensed under the PSF License
Viewed 23054 times
Revision 1
