This recipe provides the group_by functionality from Ruby on Rails, which is to be included in the next version of Ruby. The code comes from Ian Bicking's suggestion in the discussion of Ned Batchelder's post on a partition function in Ruby and JavaScript.

Original Post: http://www.nedbatchelder.com/blog/200607.html#e20060730T221504

Here is Ian Bicking's Response: http://www.nedbatchelder.com/reactor/comment.php?entryid=e20060730T221504

Python, 47 lines
### Here is my implementation of partition, which groups items by the value func returns.
def partition(iterable, func):
    result = {}
    for i in iterable:
        result.setdefault(func(i), []).append(i) 
    return result

### Ian Bicking's Generalized group function
def group(seq):
    '''seq is an iterable of (item_to_be_categorized, category) tuples'''
    result = {}
    for item, category in seq:
        result.setdefault(category, []).append(item)
    return result 



########### Usage Example ###############
def is_odd(n):
    return (n%2) == 1

l = range(10)
print(partition(l, is_odd))
print(group((item, is_odd(item)) for item in l))
print(group((item, item % 3) for item in l))      # no need for lambda/def


class Article (object):
    def __init__(self, title, summary, author):
        self.title = title
        self.summary = summary
        self.author = author

articles = [ Article('Politics & Me', 'about politics', 'Dave'),
             Article('Fishes & Me', 'about fishes', 'ray'),
             Article('Love & Me', 'about love', 'dave'),
             Article('Spams & Me', 'about spams', 'Ray'), ]

# summaries of articles indexed by author
print(group((article.summary, article.author.lower()) for article in articles))


########### Output ###############
{False: [0, 2, 4, 6, 8], True: [1, 3, 5, 7, 9]}
{False: [0, 2, 4, 6, 8], True: [1, 3, 5, 7, 9]}
{0: [0, 3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
{'dave': ['about politics', 'about love'], 'ray': ['about fishes', 'about spams']}

The group() function by Ian Bicking can be extremely useful for categorizing content that is returned as a sequence.

Note also that "(item, is_odd(item)) for item in l" is a generator expression: it yields its pairs one at a time as group() iterates over it, so no intermediate list of tuples is built. Even for a large input, the only extra storage is the resulting dict.
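To illustrate the point about the generator expression, here is a minimal sketch. It repeats the recipe's group() so that it runs on its own; the names pairs, grouped and regrouped are only illustrative.

```python
def group(seq):
    """Same logic as the recipe's group(): seq yields (item, category) pairs."""
    result = {}
    for item, category in seq:
        result.setdefault(category, []).append(item)
    return result


def is_odd(n):
    return n % 2 == 1


# A generator expression: no pair is computed until group() iterates over it.
pairs = ((n, is_odd(n)) for n in range(10))
grouped = group(pairs)
print(grouped)

# The generator is exhausted after one pass, so a second call finds nothing.
regrouped = group(pairs)
print(regrouped)
```

A generator can be consumed only once, so build a fresh one (or pass a list) if the same pairs need to be grouped again.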

As a comparison with Ruby's group_by method for a list of articles:

Using a block:

articles.group_by { |article| article.author }

Using the Rails enhancement to symbols:

articles.group_by(&:author)
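For comparison, a rough Python counterpart to group_by(&:author) can be sketched with the recipe's partition() and operator.attrgetter from the standard library. The partition() function, the Article class and the sample data are repeated from the recipe so the snippet is self-contained; by_author is just an illustrative name.

```python
from operator import attrgetter


def partition(iterable, func):
    """Same logic as the recipe's partition(): key each item by func(item)."""
    result = {}
    for i in iterable:
        result.setdefault(func(i), []).append(i)
    return result


class Article(object):
    def __init__(self, title, summary, author):
        self.title = title
        self.summary = summary
        self.author = author


articles = [Article('Politics & Me', 'about politics', 'Dave'),
            Article('Fishes & Me', 'about fishes', 'ray'),
            Article('Love & Me', 'about love', 'dave'),
            Article('Spams & Me', 'about spams', 'Ray')]

# attrgetter('author') plays the role of Ruby's &:author.
by_author = partition(articles, attrgetter('author'))
print(sorted(by_author))
```

Because the author names differ in case, this yields four groups. The recipe's own example lowercases the author first to merge them into two.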

Do keep in mind that categories must be hashable objects (in practice, immutable ones such as strings, numbers or tuples), since dict keys must be hashable. One could modify the group() function accordingly to handle categories that are given as a list instead. For background on what I'm talking about: http://dev.rubyonrails.org/ticket/4427
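One possible way to handle list categories is to convert them to tuples before using them as keys. The sketch below is an assumption about how that modification might look, not part of the original recipe; group_hashable and the sample posts data are hypothetical names invented for illustration.

```python
def group_hashable(seq):
    """Hypothetical variant of group(): list categories become tuples
    so that they can be used as dict keys."""
    result = {}
    for item, category in seq:
        if isinstance(category, list):
            category = tuple(category)  # a tuple of hashable items is hashable
        result.setdefault(category, []).append(item)
    return result


# Illustrative data: each post is categorized by a list of tags.
posts = [('post1', ['python', 'ruby']),
         ('post2', ['python']),
         ('post3', ['python', 'ruby'])]

by_tags = group_hashable(posts)
print(by_tags)
```

This only covers flat lists whose elements are themselves hashable. Nested lists or other unhashable categories would need a different conversion.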

2 comments

N N 17 years, 5 months ago

For the one-liner lovers out there.

from operator import itemgetter
from itertools import groupby

dict([(k,[h[1] for h in g]) for k,g in groupby(sorted((bool(n%2),n) for n in range(10)),key=itemgetter(0))])

dict([(k,[h[1] for h in g]) for k,g in groupby(sorted((n%3,n) for n in range(10)),key=itemgetter(0))])

dict([(k,[h[1] for h in g]) for k,g in groupby(sorted((article.author.lower(),article.summary) for article in articles),key=itemgetter(0))])

These do the same thing as the examples shown. The recipe is still useful because clarity and efficiency are better than one-liners in many cases.

Damon McCormick 15 years, 9 months ago

If you don't mind a level of indirection, you can also implement partition using group.

def partition(iterable, func):
    return group((i, func(i)) for i in iterable)
Created by David Dai on Wed, 25 Oct 2006 (PSF)