Find All Indices of a SubString in a Given String (Python recipe) by Bibha Tripathi
ActiveState Code (http://code.activestate.com/recipes/499314/)

I needed a version of the string.index(sub) function that returns a list of the indices of ALL occurrences of a substring in the string.

Is there a better/shorter/more efficient way to do this? Please share.

def allindices(string, sub, listindex, offset):
    # call as l = allindices(string, sub, [], 0)
    if (string.find(sub) == -1):
        return listindex
    else:
        offset = string.index(sub)+offset
        listindex.append(offset)
        string = string[(string.index(sub)+1):]
        return allindices(string, sub, listindex, offset+1)

      

This can be used to build a string.replaceAll(sub1, sub2) sort of operation.
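As a rough sketch of that idea, the indices can drive a replace-all. The helper names find_all and replace_all below are hypothetical, not part of the recipe, and the sketch assumes overlapping occurrences should be skipped, which is how the built-in str.replace() behaves.

```python
def find_all(string, sub):
    """Return the start index of every occurrence of sub, overlaps included."""
    indices = []
    i = string.find(sub)
    while i >= 0:
        indices.append(i)
        i = string.find(sub, i + 1)
    return indices


def replace_all(string, old, new):
    """Hypothetical helper: rebuild the string around non-overlapping matches."""
    if not old:
        raise ValueError("old must be a non-empty substring")
    pieces = []
    pos = 0  # end of the last replaced occurrence
    for i in find_all(string, old):
        if i < pos:
            continue  # overlaps the previous replacement, so skip it
        pieces.append(string[pos:i])
        pieces.append(new)
        pos = i + len(old)
    pieces.append(string[pos:])
    return "".join(pieces)


print(replace_all("banana", "ana", "X"))  # bXna, same as "banana".replace("ana", "X")
```

For plain replacement, str.replace() already does the job; a loop like this is mainly useful when each occurrence needs different treatment.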

Tags: text

4 comments

Rogier Steehouder 17 years, 4 months ago

non-recursive. How about:

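def allindices(string, sub, listindex=None, offset=0):
    # A default of [] would be shared between calls, so create the list here.
    if listindex is None:
        listindex = []
    i = string.find(sub, offset)
    while i >= 0:
        listindex.append(i)
        i = string.find(sub, i + 1)
    return listindex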

I prefer non-recursive functions. Also, there is no need to copy the string, because find() and index() can search from an offset themselves.
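To illustrate the offset argument, here is a minimal generator sketch. The name iter_indices is made up for this example; it relies only on str.find(sub, start), which returns -1 when nothing is found at or after start.

```python
def iter_indices(string, sub, start=0):
    """Yield each index where sub occurs in string, overlaps included."""
    if not sub:
        return  # an empty substring matches everywhere, so yield nothing
    i = string.find(sub, start)
    while i != -1:
        yield i
        # Resume one character later so overlapping matches are found too.
        i = string.find(sub, i + 1)


print(list(iter_indices("abracadabra", "abra")))  # [0, 7]
print(list(iter_indices("aaaa", "aa", 1)))        # [1, 2]
```

A generator avoids building the whole list when only the first few positions are needed.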

Michael Foord 17 years, 4 months ago

No Need to Pass in Empty List or Offset. How about :

def allindices(string, sub, offset=0, listindex=None):
    # call as l = allindices(string, sub)
    if listindex is None:
        listindex = []
    if (string.find(sub) == -1):
        return listindex
    else:
        offset = string.index(sub)+offset
        listindex.append(offset)
        string = string[(string.index(sub)+1):]
        # argument order must match the signature: offset first, then listindex
        return allindices(string, sub, offset+1, listindex)

Graham Fawcett 17 years, 4 months ago

Don't reinvent the wheel. You should look at the "re" module in the standard library. To find all of the starting positions of S in T:

import re
starts = [match.start() for match in re.finditer(re.escape(S), T)]

Plus it supports substitutions, and lots of other goodness.

Kent Johnson 17 years, 2 months ago

finditer() won't find overlapping strings. The regular expression methods, including finditer(), find non-overlapping matches. To find overlapping matches you need a loop.
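A short comparison, as a sketch with arbitrary sample strings. Besides an explicit loop, wrapping the pattern in a zero-width lookahead is another commonly used way to get overlapping starts out of finditer(), since the lookahead itself consumes no characters.

```python
import re

T = "aaaa"
S = "aa"

# Plain finditer(): matches do not overlap.
non_overlapping = [m.start() for m in re.finditer(re.escape(S), T)]

# Lookahead variant: each match is empty, so the scan advances one character at a time.
with_lookahead = [m.start() for m in re.finditer("(?=" + re.escape(S) + ")", T)]

# Explicit loop with str.find(), resuming one character after each hit.
with_loop = []
i = T.find(S)
while i != -1:
    with_loop.append(i)
    i = T.find(S, i + 1)

print(non_overlapping)  # [0, 2]
print(with_lookahead)   # [0, 1, 2]
print(with_loop)        # [0, 1, 2]
```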

Created by Bibha Tripathi on Thu, 14 Dec 2006 (PSF)

Required Modules

(none specified)

Other Information and Tasks

Licensed under the PSF License
Viewed 86409 times
Revision 2 (updated 17 years ago)
