
Quick and dirty script to embed unsynchronized lyrics or any other text into MP3 files. Each lyrics text file is expected to be in the same folder as its MP3 and to share its base name: for MySong.mp3, the lyrics should be in the file MySong.txt.
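The naming convention can be sketched as a small helper. This is only an illustration written for Python 3; the function names `lyrics_path_for` and `find_pairs` are hypothetical and do not appear in the recipe itself.

```python
import os


def lyrics_path_for(mp3_path):
    """Return the path of the text file expected next to an MP3 file."""
    root, _ext = os.path.splitext(mp3_path)
    return root + ".txt"


def find_pairs(folder):
    """Yield (mp3_path, lyrics_path) for each MP3 that has a lyrics file."""
    for name in sorted(os.listdir(folder)):
        # Match the extension case-insensitively, as the recipe does.
        if not name.lower().endswith(".mp3"):
            continue
        mp3_path = os.path.join(folder, name)
        txt_path = lyrics_path_for(mp3_path)
        if os.path.exists(txt_path):
            yield mp3_path, txt_path
```

Using `os.path.splitext` instead of slicing off the last three characters keeps the helper correct for names with extra dots, such as My.Song.mp3.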

The encoding of the text file will be probed in the following order: 'utf8','iso-8859-1','iso-8859-15','cp1252','cp1251','latin1'. If you need support for more encodings, a list is available at http://docs.python.org/release/2.5.2/lib/standard-encodings.html
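A minimal sketch of this probing order, written for Python 3 (the recipe itself targets Python 2). The function name `probe_decode` is hypothetical. One caveat worth seeing in action: 'iso-8859-1' (the same codec as 'latin1') maps every possible byte to a character, so anything that is not valid UTF-8 is accepted there and the later candidates are never reached.

```python
ENCODINGS = ("utf8", "iso-8859-1", "iso-8859-15", "cp1252", "cp1251", "latin1")


def probe_decode(raw, encodings=ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes raw bytes."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


# Valid UTF-8 is recognised by the first candidate.
text, enc = probe_decode("señor".encode("utf8"))
print(enc)  # utf8

# cp1252 bytes are not valid UTF-8, so they fall through to iso-8859-1.
text, enc = probe_decode("señor".encode("cp1252"))
print(enc)  # iso-8859-1

# cp1251 (Cyrillic) bytes are also accepted by iso-8859-1, but the
# resulting text is mojibake rather than the original word.
text, enc = probe_decode("Привет".encode("cp1251"))
print(enc)  # iso-8859-1
```

If Cyrillic files matter more than Western European ones, one option is to pass a custom tuple with 'cp1251' ahead of 'iso-8859-1'; no ordering can distinguish all single-byte encodings reliably.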

To see the lyrics on an iPod (tested on 6G Classic) you need to press the middle button four times while a song is playing.

The script can also be used to set other ID3 tags. By default SET_OTHER_ID3_TAGS is False so existing ID3 tags will NOT be overwritten.

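Usage: Running the script without arguments processes all MP3 files in the directory that contains the script. The code takes the directory of sys.argv[0], which matches the current directory only when the script is run from its own folder.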

Alternatively, the path to the folder with MP3s can be passed as the first argument.
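The folder selection can be sketched as a standalone function. This is a Python 3 illustration that mirrors the fallback used in the listing; the name `resolve_folder` is hypothetical.

```python
import os
import sys


def resolve_folder(argv):
    """Pick the folder to process from a sys.argv-style list."""
    if len(argv) > 1:
        # An explicit folder was passed as the first argument.
        return argv[1]
    # No argument: fall back to the directory containing the script,
    # which is what os.path.dirname(sys.argv[0]) resolves to.
    return os.path.abspath(os.path.dirname(argv[0]))


if __name__ == "__main__":
    print(resolve_folder(sys.argv))
```

Swapping the fallback for `os.getcwd()` would make the script follow the shell's working directory instead, if that behaviour is preferred.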

Python, 83 lines
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys
from mutagen.id3 import ID3, ID3NoHeaderError
from mutagen.id3 import TIT2, TALB, TPE1, TPE2, COMM, USLT, TCOM, TCON, TDRC

TEXT_ENCODING = 'utf8'
SET_OTHER_ID3_TAGS = False

# get workdir from first arg or use the directory containing this script
if (len(sys.argv) > 1):
    fpath = sys.argv[1]
else:
    fpath = os.path.abspath(os.path.dirname(sys.argv[0]))

for fn in os.listdir(fpath):

    fname = os.path.join(fpath, fn)
    if fname.lower().endswith('.mp3'):
        
        lyrics = None
        
        lyrfname = fname[:-3] + 'txt'

        if not os.path.exists(lyrfname):
            print '\tERROR: No lyrics file found:', lyrfname, '...skipping'
            continue
        else:
            # read raw bytes; the encoding is detected below
            with open(lyrfname, 'rb') as f:
                lyrics = f.read().strip()

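        # try to find the right encoding; note that 'iso-8859-1' (the same
        # codec as 'latin1') accepts any byte sequence, so the entries after
        # it are never reached
        for enc in ('utf8','iso-8859-1','iso-8859-15','cp1252','cp1251','latin1'):
            try:
                lyrics = lyrics.decode(enc)
                print enc,
                break
            except UnicodeDecodeError:
                pass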
        
        # create ID3 tag if not exists
        try: 
            tags = ID3(fname)
        except ID3NoHeaderError:
            print "Adding ID3 header;",
            tags = ID3()

        # remove old unsynchronized lyrics; getall/delall match on the frame
        # ID prefix, so "USLT" covers every description and language
        if len(tags.getall(u"USLT")) != 0:
            print "Removing Lyrics."
            tags.delall(u"USLT")
            tags.save(fname)

        # apparently the description is important when more than one
        # USLT frame is present; add() stores the frame under its own key
        tags.add(USLT(encoding=3, lang=u'eng', desc=u'desc', text=lyrics))
        print 'Added USLT frame to', fn
        
        # set title from filename; adjust to your needs
        if SET_OTHER_ID3_TAGS:
            title = unicode(os.path.splitext(fn)[0])
            print title, 
            print fname
            # title
            tags["TIT2"] = TIT2(encoding=3, text= title)
            tags["TALB"] = TALB(encoding=3, text= u'mutagen Album Name')
            tags["TPE2"] = TPE2(encoding=3, text= u'mutagen Band')
            tags["COMM"] = COMM(encoding=3, lang=u'eng', desc=u'desc', text=u'mutagen comment')
            # artist
            tags["TPE1"] = TPE1(encoding=3, text= u'mutagen Artist')
            # composer 
            tags["TCOM"] = TCOM(encoding=3, text= u'mutagen Composer')
            # genre
            tags["TCON"] = TCON(encoding=3, text= u'mutagen Genre')
            tags["TDRC"] = TDRC(encoding=3, text= u'2010')
            # use to set the track number (requires importing TRCK from mutagen.id3)
            #tags["TRCK"] = TRCK(encoding=3, text=track_number)
        tags.save(fname)

print 'Done'

2 comments

Gabriel Genellina 14 years ago
The encoding of the text file will be probed in the following order: 
'utf8','latin1','iso-8859-1','iso-8859-15','cp1252','cp1251'.

Note that every byte sequence is valid latin-1, so the search stops there.

Try decodeh: http://gizmojo.org/code/decodeh/

ccpizza (author) 13 years, 4 months ago

Gabriel, thanks for the note about the order. I tried using the chardet module to detect the encoding, but it did not work well on all files. In particular, it failed to reliably detect the proper encoding for Spanish text.