A globally unique identifier that combines IP address, time, and a counter (earlier versions used random bits). Because the time is listed first, you can sort records by GUID. You can also extract the time and IP if needed. GUIDs make wonderful database keys: they require no access to the database (to get the max index number), they are effectively unique, and they sort automatically by time.

Python, 183 lines
#!/usr/bin/env python

# GUID.py
# Version 2.6
#
# Copyright (c) 2006 Conan C. Albrecht
#
# Permission is hereby granted, free of charge, to any person obtaining a copy 
# of this software and associated documentation files (the "Software"), to deal 
# in the Software without restriction, including without limitation the rights 
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 
# copies of the Software, and to permit persons to whom the Software is furnished 
# to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all 
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 
# PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 
# FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
# OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
# DEALINGS IN THE SOFTWARE.



##################################################################################################
###   A globally-unique identifier made up of time and ip and 8 digits for a counter: 
###   each GUID is 40 characters wide
###
###   A globally unique identifier that combines ip, time, and a counter.  Since the 
###   time is listed first, you can sort records by guid.  You can also extract the time 
###   and ip if needed.  
###
###   Since the counter has eight hex characters, you can create up to 
###   0xffffffff (4294967295) GUIDs every millisecond.  If your processor
###   is somehow fast enough to create more than that in a millisecond (looking
###   toward the future, of course), the function will wait until the next
###   millisecond to return.
###     
###   GUIDs make wonderful database keys.  They require no access to the 
###   database (to get the max index number), they are extremely unique, and they sort 
###   automatically by time.   GUIDs prevent key clashes when merging
###   two databases together, combining data, or generating keys in distributed
###   systems.
###
###   There is an Internet Draft for UUIDs, but this module does not implement it.
###   If the draft catches on, perhaps I'll conform the module to it.
###


# Changelog
# Sometime, 1997     Created the Java version of GUID
#                    Went through many versions in Java
# Sometime, 2002     Created the Python version of GUID, mirroring the Java version
# November 24, 2003  Changed Python version to be more pythonic, took out object and made just a module
# December 2, 2003   Fixed duplicating GUIDs.  Sometimes they duplicate if multiples are created
#                    in the same millisecond (it checks the last 100 GUIDs now and has a larger random part)
# December 9, 2003   Fixed MAX_RANDOM, which was going over sys.maxint
# June 12, 2004      Allowed a custom IP address to be sent in rather than always using the 
#                    local IP address.  
# November 4, 2005   Changed the random part to a counter variable.  Now GUIDs are totally 
#                    unique and more efficient, as long as they are created by only
#                    one runtime on a given machine.  The counter part is after the time
#                    part so it sorts correctly.
# November 8, 2005   The counter variable now starts at a random long and cycles
#                    around.  This is in case two guids are created on the same
#                    machine at the same millisecond (by different processes).  Even though
#                    it is still possible for a duplicate GUID to be created, this makes it
#                    highly unlikely since the counter will likely be different.
# November 11, 2005  Fixed a bug in the new IP getting algorithm.  Also, use IPv6 range
#                    for IP when we make it up (when it's not accessible)
# November 21, 2005  Added better IP-finding code.  It finds IP address better now.
# January 5, 2006    Fixed a small bug caused in old versions of python (random module use)

import math
import socket
import random
import sys
import time
import threading



#############################
###   global module variables

#Makes a hex IP from a decimal dot-separated ip (eg: 127.0.0.1)
make_hexip = lambda ip: ''.join(["%04x" % long(i) for i in ip.split('.')]) # leave space for ip v6 (65K in each sub)
  
MAX_COUNTER = 0xfffffffe
counter = 0L
firstcounter = MAX_COUNTER
lasttime = 0
ip = ''
lock = threading.RLock()
try:  # only need to get the IP address once
  ip = socket.getaddrinfo(socket.gethostname(),0)[-1][-1][0]
  hexip = make_hexip(ip)
except: # if we don't have an ip, default to something in the 10.x.x.x private range
  ip = '10'
  rand = random.Random()
  for i in range(3):
    ip += '.' + str(rand.randrange(1, 0xffff))  # might as well use IPv6 range if we're making it up
  hexip = make_hexip(ip)

  
#################################
###   Public module functions

def generate(ip=None):
  '''Generates a new guid.  A guid is unique in space and time because it combines
     the machine IP with the current time in milliseconds.  Be careful about sending in
     a specified IP address because the ip makes it unique in space.  You could send in
     the same IP address that is created on another machine.
  '''
  global counter, firstcounter, lasttime
  lock.acquire() # can't generate two guids at the same time
  try:
    parts = []

    # do we need to wait for the next millisecond (are we out of counters?)
    now = long(time.time() * 1000)
    while lasttime == now and counter == firstcounter: 
      time.sleep(.01)
      now = long(time.time() * 1000)

    # time part
    parts.append("%016x" % now)

    # counter part
    if lasttime != now:  # time to start counter over since we have a different millisecond
      firstcounter = long(random.uniform(1, MAX_COUNTER))  # start at random position
      counter = firstcounter
    counter += 1
    if counter > MAX_COUNTER:
      counter = 0
    lasttime = now
    parts.append("%08x" % (counter)) 

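    # ip part (use the caller-supplied ip if one was given, else the module-level one)
    if ip is None:
      parts.append(hexip)
    else:
      parts.append(make_hexip(ip))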

    # put them all together
    return ''.join(parts)
  finally:
    lock.release()
    

def extract_time(guid):
  '''Extracts the time portion out of the guid and returns the 
     number of seconds since the epoch as a float'''
  return float(long(guid[0:16], 16)) / 1000.0


def extract_counter(guid):
  '''Extracts the counter from the guid (returns the bits in decimal)'''
  return int(guid[16:24], 16)


def extract_ip(guid):
  '''Extracts the ip portion out of the guid and returns it
     as a string like 10.10.10.10'''
  # there's probably a more elegant way to do this
  thisip = []
  for index in range(24, 40, 4):
    thisip.append(str(int(guid[index: index + 4], 16)))
  return '.'.join(thisip)



### TESTING OF GUID CLASS ###
if __name__ == "__main__":
  guids = []
  for i in range(10):  # calculate very fast so people can see the counter in action
    guid = generate()
    guids.append(guid)
  for guid in guids:
    print "GUID:", guid
    guidtime = extract_time(guid)
    print "\tTime:   ", time.strftime('%a, %d %b %Y %H:%M:%S', time.localtime(guidtime)), '(millis: ' + str(round(guidtime - long(guidtime), 3)) + ')'
    print "\tIP:     ", extract_ip(guid)
    print "\tCounter:", extract_counter(guid)
  

  

This version uses a counter instead of random bits and is released under the MIT license. It is faster and should make for better GUIDs. I did not incorporate using a class as is done in the example below because I like GUIDs to remain simple strings. However, you could easily modify this new version to be a class if you prefer that method.
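For readers on Python 3, where `long`, the `0L` literal, and the `print` statement no longer exist, the same 40-character layout (16 hex digits of time, 8 of counter, 16 of IP) can be sketched as below. This is a minimal port under stated assumptions, not the author's code: the `127.0.0.1` default stands in for the recipe's hostname lookup, and the underscore-prefixed names are illustrative only.

```python
import random
import threading
import time

MAX_COUNTER = 0xfffffffe
_lock = threading.Lock()
_last_ms = 0      # millisecond of the most recent GUID
_counter = 0      # current counter value
_first = 0        # counter value the current millisecond started from


def make_hexip(ip):
    """Encode a dotted IPv4 string as four 4-hex-digit fields (16 characters)."""
    return ''.join('%04x' % int(part) for part in ip.split('.'))


def generate(ip='127.0.0.1'):
    """Return a 40-character GUID: 16 hex time + 8 hex counter + 16 hex ip."""
    global _last_ms, _counter, _first
    with _lock:
        now = int(time.time() * 1000)
        # The counter has cycled all the way around in one millisecond: wait.
        while now == _last_ms and _counter == _first:
            time.sleep(0.001)
            now = int(time.time() * 1000)
        if now != _last_ms:
            # New millisecond: restart the counter at a random position.
            _first = random.randrange(1, MAX_COUNTER)
            _counter = _first
        _counter += 1
        if _counter > MAX_COUNTER:
            _counter = 0
        _last_ms = now
        return '%016x%08x%s' % (now, _counter, make_hexip(ip))


def extract_time(guid):
    """Seconds since the epoch, as a float."""
    return int(guid[0:16], 16) / 1000.0


def extract_counter(guid):
    """Counter field as an integer."""
    return int(guid[16:24], 16)


def extract_ip(guid):
    """Dotted IP string recovered from the last 16 hex digits."""
    return '.'.join(str(int(guid[i:i + 4], 16)) for i in range(24, 40, 4))
```

Because every field is fixed-width lowercase hex with the time first, comparing two GUID strings compares their timestamps before anything else, which is what makes a plain string sort follow creation time.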

16 comments

Rodrigo Oliveira 21 years, 5 months ago

Very good but I would add format checking and __eq__... Here's a possibly improved version with basic GUID format checking, __eq__ implementation and a more pythonic ip implementation. Hope you like it.

thanks for sharing.

#!/usr/bin/python

# A globally unique identifier made up of time and ip
# Copyright (C) 2002  Dr. Conan C. Albrecht
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

import random
import socket
import time

class GUID:
  '''A globally-unique identifier made up of time and ip and 3 random digits: 35 characters wide

     A globally unique identifier that combines ip, time, and random bits.  Since the
     time is listed first, you can sort records by guid.  You can also extract the time
     and ip if needed.

     GUIDs make wonderful database keys.  They require no access to the
     database (to get the max index number), they are extremely unique, and they sort
     automatically by time.   GUIDs prevent key clashes when merging
     two databases together, combining data, or generating keys in distributed
     systems.
  '''
  rand = random.Random()
  ip = ''
  try:
    ip = socket.gethostbyname(socket.gethostname())
  except (socket.gaierror): # if we don't have an ip, default to something in the 10.x.x.x private range
    ip = '10'
    for i in range(3):
      ip += '.' + str(rand.randrange(1, 254))
  hexip = ''.join(["%04x" % long(i) for i in ip.split('.')]) # leave space for ip v6 (65K in each sub)
  lastguid = ''

  def __init__(self, guid=None):
    '''Constructor.  Use no args if you want the guid generated (this is the normal method)
       or send a string-typed guid to generate it from the string'''
    if guid is None:
      self.guid = self.__class__.lastguid
      while self.guid == self.__class__.lastguid:
        # time part
        now = long(time.time() * 1000)
        self.guid = ("%016x" % now) + self.__class__.hexip
        # random part
        self.guid += ("%03x" % (self.__class__.rand.randrange(0, 4095)))
      self.__class__.lastguid = self.guid

    elif type(guid) == type(self): # if a GUID object, copy its value
      self.guid = str(guid)

    else: # if a string, just save its value
      assert self._check(guid), guid + " is not a valid GUID!"
      self.guid = guid

  def __eq__(self, other):
      '''Return true if both GUID strings are equal'''
      if isinstance(other, self.__class__):
          return str(self) == str(other)
      return 0

  def __str__(self):
    '''Returns the string value of this guid'''
    return self.guid

  def time(self):
    '''Extracts the time portion out of the guid and returns the
       number of milliseconds since the epoch'''
    return long(self.guid[0:16], 16)

  def ip(self):
    '''Extracts the ip portion out of the guid and returns it
       as a string like 10.10.10.10'''
    # there's probably a more elegant way to do this
    ip = []
    index = 16
    while index

Conan Albrecht (author) 21 years, 5 months ago

Changes. Rodrigo -- Your code was cut off in the posting. Please send me the code to conan_albrecht@byu.edu and I'll integrate your changes back into the original and repost a new version. Thanks.

In fact, if others have more "pythonic" ways of doing things, let me know. I'm much more mature in other languages and relatively new to python.

Conan Albrecht (author) 20 years, 5 months ago

Updated the code. I updated the code based upon reviews from many users. The main difference is the new GUID code simply generates a regular String object, rather than a GUID object. Why did I create a separate object? It seemed everyone was using GUIDs as strings anyway, so I removed the class and made functions instead.

The class now respects multiple threads, and it has been simplified somewhat.

Ulrich Hoffmann 20 years, 4 months ago

int overflow under Python 2.2. Under Python 2.2 the phrase

import random; random.Random().randrange(0, 4294967296L)

fails with

Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/random.py", line 294, in randrange
    istop = int(stop)
OverflowError: long int too large to convert to int

while

import random; random.Random().randrange(0, 4294967296L/2-1)

succeeds.

Python 2.3 runs both successfully.

Maybe line 127

guid += ("%08x" % rand.randrange(0, 4294967296L))

should instead read

guid += ("%08x" % rand.randrange(0, 2147483647L))

to enable Python cross-version compatibility.

Robert Brewer 20 years, 4 months ago

Alternate version using descending sequence instead of random. Instead of random bits, I use a descending sequence. This uses a simple generator to start that segment at sys.maxint, decrementing by one on each call. 2.2 or later.

import math, sys, time
def _unique_sequencer():
    _XUnit_sequence = sys.maxint
    while 1:
        yield _XUnit_sequence
        _XUnit_sequence -= 1
        if _XUnit_sequence <= 0:
            _XUnit_sequence = sys.maxint
_uniqueid = _unique_sequencer()

def uniqueid(prefix=''):
    frac, secs = math.modf(time.time())
    days, remain = divmod(secs, 86400)
    id = _uniqueid.next()
    return u"%s%s%s_%s" % (prefix, hex(int(days))[2:],
                           hex(int(remain))[2:], hex(id)[2:])

Irmen de Jong 19 years, 11 months ago

Another implementation with MIT-license is part of Pyro. For another GUID-generator, but licensed under a very liberal MIT software license, look in the Pyro.util package of Pyro. (http://pyro.sourceforge.net).

Oren Tirosh 18 years, 10 months ago

Random GUIDs. 16 bytes from a cryptographic-quality random number source like os.urandom() are just as good as a method of generating GUIDs.
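As a rough illustration of that suggestion (Python 3, with a hypothetical function name), 16 random bytes can be rendered as a 32-character hex string. Unlike the recipe's GUIDs, such identifiers carry no time or IP, so they neither sort by creation time nor allow those fields to be extracted.

```python
import os


def random_guid():
    """Return 16 bytes from the OS random source as 32 lowercase hex characters."""
    return os.urandom(16).hex()


print(random_guid())
```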

Conan Albrecht (author) 18 years, 5 months ago

Why I don't use the class version. Rodrigo -- nice class version. I just modified my original code recipe so yours needs to be updated if you like my new changes. The reason I don't use a class as you suggest is because potentially thousands of GUIDs are loaded from the DB at a time. Your class constructor has to parse the GUID every time it is created, which adds processing time. In addition, you then convert it back to a string to compare, which adds more time. Keeping it as a string doesn't require any additional processing time at create. Yours could actually be modified to compare the raw numbers instead of the string representation (which would be more efficient), but I still think it's easier to keep it as a string from the start. In the end it's just preference.

Conan Albrecht (author) 18 years, 5 months ago

Recipe now uses counter. The recipe above now uses a variation of your method here. I use an increasing counter so it's easier to sort them (why use a decreasing counter?), but otherwise I think this idea is now in the main recipe.

Troy Kruthoff 18 years, 4 months ago

Best way to store in PostgreSQL. Is it possible to convert the string to a number value and store it in a NUMERIC type field in PostgreSQL?
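One possible approach to the conversion itself, sketched in Python 3 with hypothetical function names: the GUID is a 40-digit hexadecimal string, so it parses as a 160-bit integer that needs at most 49 decimal digits. Whether a numeric column is actually preferable to a fixed-width character column is left open here.

```python
def guid_to_int(guid):
    """Parse the 40 hex digits of a GUID as one integer."""
    return int(guid, 16)


def int_to_guid(value):
    """Format an integer back into the 40-character, zero-padded hex form."""
    return '%040x' % value


sample = '0000010a1b2c3d4e' + '0000002a' + '000a000100020003'
number = guid_to_int(sample)
print(number)
print(int_to_guid(number) == sample)
```

Since the string is fixed-width lowercase hex, numeric order and string order agree, so sorting by the integer still sorts by time.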

Wai Yip Tung 18 years, 3 months ago

If you have pywin32.

>>> import pythoncom
>>> print pythoncom.CreateGuid()
{600AFA7C-E537-424B-8EE2-A54A18102EFA}

Alex Greif 17 years, 7 months ago

The ip argument in the generate() method is never used. Could you please fix it?

Daryl Spitzer 17 years, 2 months ago

"GUID.py:92: FutureWarning: hex/oct constants > sys.maxint..." I get "GUID.py:92: FutureWarning: hex/oct constants > sys.maxint will return positive values in Python 2.4 and up" from Python 2.3.5 (which is pre-installed on Mac OS X 10.4), and the counter seems to always be 0!

I changed "MAX_COUNTER = 0xfffffffe" to "MAX_COUNTER = sys.maxint" which fixes the problem.
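A likely explanation, consistent with the warning text: on a 32-bit build of Python before 2.4, a hex literal above sys.maxint is read as a negative int, so 0xfffffffe becomes -2 and the recipe's wrap test `counter > MAX_COUNTER` is always true. The sketch below reproduces that arithmetic in Python 3; `as_signed_32` and `next_counter` are illustrative helpers, not part of the recipe.

```python
def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def next_counter(counter, max_counter):
    """The recipe's increment-and-wrap step, isolated for illustration."""
    counter += 1
    if counter > max_counter:
        counter = 0
    return counter


old_limit = as_signed_32(0xfffffffe)     # the literal read as a signed 32-bit int
print(old_limit)                         # -2
print(next_counter(12345, old_limit))    # 0, since every counter exceeds a negative limit
print(next_counter(12345, 0xfffffffe))   # 12346 with the intended positive limit
```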

George V. Reilly 15 years, 9 months ago

Use the uuid module instead. An industrial-strength uuid module is included in python 2.5+. 2.3 and 2.4 users can get the module from http://zesty.ca/python/uuid.html
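For comparison, a short sketch of the standard-library module mentioned here, written for Python 3; the variable names are illustrative.

```python
import uuid

time_based = uuid.uuid1()     # version 1: timestamp plus a node identifier
random_based = uuid.uuid4()   # version 4: random bits

print(time_based.hex)         # 32 hex characters, no dashes
print(str(random_based))      # 36 characters in 8-4-4-4-12 form
```

Note that, unlike this recipe's GUIDs, the string form of a version 1 UUID begins with the low bits of its timestamp, so sorting those strings does not sort by creation time.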