"""Stuff to parse Sun and NeXT audio files.

An audio file consists of a header followed by the data.  The structure
of the header is as follows.

        +---------------+
        | magic word    |
        +---------------+
        | header size   |
        +---------------+
        | data size     |
        +---------------+
        | encoding      |
        +---------------+
        | sample rate   |
        +---------------+
        | # of channels |
        +---------------+
        | info          |
        |               |
        +---------------+

The magic word consists of the 4 characters '.snd'.  Apart from the
info field, all header fields are 4 bytes in size.  They are all
32-bit unsigned integers encoded in big-endian byte order.

The header size really gives the start of the data.
The data size is the physical size of the data.  From the other
parameters the number of frames can be calculated.
The encoding gives the way in which audio samples are encoded.
Possible values are listed below.
The info field currently consists of an ASCII string giving a
human-readable description of the audio file.  The info field is
padded with NUL bytes to the header size.
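
As a rough illustration of the layout above, the fixed part of the
header can be unpacked in one call with the struct module.  This is
only a sketch: parse_au_header is a hypothetical helper, not a function
of this module, and the sample bytes are made up for the example.

```python
import struct

AUDIO_FILE_MAGIC = 0x2e736e64   # the characters '.snd' as a big-endian u32
FIXED_HEADER_SIZE = 24          # six 4-byte fields precede the info field

def parse_au_header(raw):
    # Hypothetical helper: decode the six fixed fields and the info string.
    if len(raw) < FIXED_HEADER_SIZE:
        raise ValueError('truncated header')
    (magic, header_size, data_size,
     encoding, sample_rate, nchannels) = struct.unpack(
        '>6I', raw[:FIXED_HEADER_SIZE])
    if magic != AUDIO_FILE_MAGIC:
        raise ValueError('bad magic number')
    # The info field runs from byte 24 up to the header size and is
    # NUL-padded, so keep only the part before the first NUL byte.
    nul = struct.pack('x')
    info = raw[FIXED_HEADER_SIZE:header_size].split(nul)[0]
    return {
        'header_size': header_size,
        'data_size': data_size,
        'encoding': encoding,
        'sample_rate': sample_rate,
        'nchannels': nchannels,
        'info': info,
    }

# Made-up header: 32 bytes long, 16-bit linear (encoding 3), 8000 Hz, stereo.
sample = (struct.pack('>6I', AUDIO_FILE_MAGIC, 32, 16000, 3, 8000, 2)
          + b'demo' + struct.pack('4x'))
fields = parse_au_header(sample)
```

The module itself reads the fields one at a time with _read_u32(); the
single struct.unpack call is a substitute chosen to keep the sketch short.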

Usage.

Reading audio files:
        f = sunau.open(file, 'r')
where file is either the name of a file or an open file pointer.
The open file pointer must have methods read(), seek(), and close().
When the setpos() and rewind() methods are not used, the seek()
method is not necessary.

This returns an instance of a class with the following public methods:
        getnchannels()  -- returns number of audio channels (1 for
                           mono, 2 for stereo)
        getsampwidth()  -- returns sample width in bytes
        getframerate()  -- returns sampling frequency
        getnframes()    -- returns number of audio frames
        getcomptype()   -- returns compression type ('NONE', 'ULAW' or
                           'ALAW')
        getcompname()   -- returns human-readable version of
                           compression type ('not compressed' matches 'NONE')
        getparams()     -- returns a tuple consisting of all of the
                           above in the above order
        getmarkers()    -- returns None (for compatibility with the
                           aifc module)
        getmark(id)     -- raises an error since the mark does not
                           exist (for compatibility with the aifc module)
        readframes(n)   -- returns at most n frames of audio
        rewind()        -- rewind to the beginning of the audio stream
        setpos(pos)     -- seek to the specified position
        tell()          -- return the current position
        close()         -- close the instance (make it unusable)
The position returned by tell() and the position given to setpos()
are compatible: both are counted in frames from the start of the audio
data and have nothing to do with the actual byte position in the file.
The close() method is called automatically when the class instance
is destroyed.
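
The relationship between frame positions and byte offsets can be
sketched as follows.  The helper names are hypothetical, and the sketch
assumes the stream starts at offset 0, so that the audio data begins at
the header size.  Note that bytes_per_sample is the stored size of one
sample (1 for u-law data), which can differ from the value returned by
getsampwidth().

```python
def frame_size(bytes_per_sample, nchannels):
    # One frame holds one stored sample for every channel.
    return bytes_per_sample * nchannels

def frame_count(data_size, bytes_per_sample, nchannels):
    # Number of whole frames contained in data_size bytes of audio data.
    return data_size // frame_size(bytes_per_sample, nchannels)

def byte_offset(header_size, pos, bytes_per_sample, nchannels):
    # File offset of frame number pos, counted from the start of the file.
    return header_size + pos * frame_size(bytes_per_sample, nchannels)

# 16000 bytes of 16-bit stereo audio stored behind a 32-byte header.
nframes = frame_count(16000, 2, 2)
offset = byte_offset(32, 10, 2, 2)
```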

Writing audio files:
        f = sunau.open(file, 'w')
where file is either the name of a file or an open file pointer.
The open file pointer must have methods write(), tell(), seek(), and
close().

This returns an instance of a class with the following public methods:
        setnchannels(n) -- set the number of channels
        setsampwidth(n) -- set the sample width
        setframerate(n) -- set the frame rate
        setnframes(n)   -- set the number of frames
        setcomptype(type, name)
                        -- set the compression type and the
                           human-readable compression type
        setparams(tuple)-- set all parameters at once
        tell()          -- return current position in output file
        writeframesraw(data)
                        -- write audio frames without patching up the
                           file header
        writeframes(data)
                        -- write audio frames and patch up the file header
        close()         -- patch up the file header and close the
                           output file
You should set the parameters before the first writeframesraw or
writeframes.  The total number of frames does not need to be set,
but when it is set to the correct value, the header does not have to
be patched up.
It is best to first set all parameters, possibly including the
compression type, and then write audio frames using writeframesraw.
When all frames have been written, either call writeframes('') or
close() to patch up the sizes in the header.
The close() method is called automatically when the class instance
is destroyed.
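
The header patching described above can be sketched with an in-memory
stream.  The helper names are hypothetical and are not part of this
module.  The sketch assumes the stream starts at offset 0, so that the
data size field sits at byte 8, and it only writes 16-bit linear
samples.  The header size is rounded up to a multiple of 8 with at
least one NUL byte after the info string.

```python
import io
import struct

AUDIO_FILE_MAGIC = 0x2e736e64
AUDIO_UNKNOWN_SIZE = 0xFFFFFFFF
AUDIO_FILE_ENCODING_LINEAR_16 = 3

def padded_header_size(info):
    # 24 fixed bytes, the info string and at least one NUL byte,
    # rounded up to the next multiple of 8.
    return (25 + len(info) + 7) & ~7

def write_header(stream, framerate, nchannels, info=b''):
    # Write a header whose data size is still unknown.
    header_size = padded_header_size(info)
    stream.write(struct.pack('>6I', AUDIO_FILE_MAGIC, header_size,
                             AUDIO_UNKNOWN_SIZE,
                             AUDIO_FILE_ENCODING_LINEAR_16,
                             framerate, nchannels))
    stream.write(info)
    # Pad the info field with NUL bytes up to the header size.
    stream.write(struct.pack('%dx' % (header_size - 24 - len(info))))
    return header_size

def patch_data_size(stream, data_size):
    # The data size is the third header field, after the magic word
    # and the header size, hence offset 8 in a stream that starts at 0.
    stream.seek(8)
    stream.write(struct.pack('>I', data_size))
    stream.seek(0, 2)           # return to the end of the stream

out = io.BytesIO()
header_size = write_header(out, 8000, 1, b'demo')
frames = struct.pack('>4h', 0, 1000, 0, -1000)   # four mono frames
out.write(frames)
patch_data_size(out, len(frames))
```

If the number of frames is set correctly before the first write, the
data size is already right in the header and no seek is needed, which
matters for streams that cannot seek.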
"""

# from <multimedia/audio_filehdr.h>
AUDIO_FILE_MAGIC = 0x2e736e64
AUDIO_FILE_ENCODING_MULAW_8 = 1
AUDIO_FILE_ENCODING_LINEAR_8 = 2
AUDIO_FILE_ENCODING_LINEAR_16 = 3
AUDIO_FILE_ENCODING_LINEAR_24 = 4
AUDIO_FILE_ENCODING_LINEAR_32 = 5
AUDIO_FILE_ENCODING_FLOAT = 6
AUDIO_FILE_ENCODING_DOUBLE = 7
AUDIO_FILE_ENCODING_ADPCM_G721 = 23
AUDIO_FILE_ENCODING_ADPCM_G722 = 24
AUDIO_FILE_ENCODING_ADPCM_G723_3 = 25
AUDIO_FILE_ENCODING_ADPCM_G723_5 = 26
AUDIO_FILE_ENCODING_ALAW_8 = 27

# from <multimedia/audio_hdr.h>
AUDIO_UNKNOWN_SIZE = 0xFFFFFFFFL        # ((unsigned)(~0))

_simple_encodings = [AUDIO_FILE_ENCODING_MULAW_8,
                     AUDIO_FILE_ENCODING_LINEAR_8,
                     AUDIO_FILE_ENCODING_LINEAR_16,
                     AUDIO_FILE_ENCODING_LINEAR_24,
                     AUDIO_FILE_ENCODING_LINEAR_32,
                     AUDIO_FILE_ENCODING_ALAW_8]

class Error(Exception):
    pass

def _read_u32(file):
    x = 0L
    for i in range(4):
        byte = file.read(1)
        if byte == '':
            raise EOFError
        x = x*256 + ord(byte)
    return x

def _write_u32(file, x):
    data = []
    for i in range(4):
        d, m = divmod(x, 256)
        data.insert(0, m)
        x = d
    for i in range(4):
        file.write(chr(int(data[i])))

class Au_read:

    def __init__(self, f):
        if type(f) == type(''):
            import __builtin__
            f = __builtin__.open(f, 'rb')
        self.initfp(f)

    def __del__(self):
        if self._file:
            self.close()

    def initfp(self, file):
        self._file = file
        self._soundpos = 0
        magic = int(_read_u32(file))
        if magic != AUDIO_FILE_MAGIC:
            raise Error, 'bad magic number'
        self._hdr_size = int(_read_u32(file))
        if self._hdr_size < 24:
            raise Error, 'header size too small'
        if self._hdr_size > 100:
            raise Error, 'header size ridiculously large'
        self._data_size = _read_u32(file)
        if self._data_size != AUDIO_UNKNOWN_SIZE:
            self._data_size = int(self._data_size)
        self._encoding = int(_read_u32(file))
        if self._encoding not in _simple_encodings:
            raise Error, 'encoding not (yet) supported'
        if self._encoding in (AUDIO_FILE_ENCODING_MULAW_8,
                              AUDIO_FILE_ENCODING_ALAW_8):
            self._sampwidth = 2
            self._framesize = 1
        elif self._encoding == AUDIO_FILE_ENCODING_LINEAR_8:
            self._framesize = self._sampwidth = 1
        elif self._encoding == AUDIO_FILE_ENCODING_LINEAR_16:
            self._framesize = self._sampwidth = 2
        elif self._encoding == AUDIO_FILE_ENCODING_LINEAR_24:
            self._framesize = self._sampwidth = 3
        elif self._encoding == AUDIO_FILE_ENCODING_LINEAR_32:
            self._framesize = self._sampwidth = 4
        else:
            raise Error, 'unknown encoding'
        self._framerate = int(_read_u32(file))
        self._nchannels = int(_read_u32(file))
        self._framesize = self._framesize * self._nchannels
        if self._hdr_size > 24:
            self._info = file.read(self._hdr_size - 24)
            for i in range(len(self._info)):
                if self._info[i] == '\0':
                    self._info = self._info[:i]
                    break
        else:
            self._info = ''
        try:
            self._data_pos = file.tell()
        except (AttributeError, IOError):
            self._data_pos = None

    def getfp(self):
        return self._file

    def getnchannels(self):
        return self._nchannels

    def getsampwidth(self):
        return self._sampwidth

    def getframerate(self):
        return self._framerate

    def getnframes(self):
        if self._data_size == AUDIO_UNKNOWN_SIZE:
            return AUDIO_UNKNOWN_SIZE
        if self._encoding in _simple_encodings:
            return self._data_size // self._framesize
        return 0                # XXX--must do some arithmetic here

    def getcomptype(self):
        if self._encoding == AUDIO_FILE_ENCODING_MULAW_8:
            return 'ULAW'
        elif self._encoding == AUDIO_FILE_ENCODING_ALAW_8:
            return 'ALAW'
        else:
            return 'NONE'

    def getcompname(self):
        if self._encoding == AUDIO_FILE_ENCODING_MULAW_8:
            return 'CCITT G.711 u-law'
        elif self._encoding == AUDIO_FILE_ENCODING_ALAW_8:
            return 'CCITT G.711 A-law'
        else:
            return 'not compressed'

    def getparams(self):
        return self.getnchannels(), self.getsampwidth(), \
                  self.getframerate(), self.getnframes(), \
                  self.getcomptype(), self.getcompname()

    def getmarkers(self):
        return None

    def getmark(self, id):
        raise Error, 'no marks'

    def readframes(self, nframes):
        if self._encoding in _simple_encodings:
            if nframes == AUDIO_UNKNOWN_SIZE:
                data = self._file.read()
            else:
                data = self._file.read(nframes * self._framesize)
            self._soundpos += len(data) // self._framesize
            if self._encoding == AUDIO_FILE_ENCODING_MULAW_8:
                import audioop
                data = audioop.ulaw2lin(data, self._sampwidth)
            return data
        return None             # XXX--not implemented yet

    def rewind(self):
        if self._data_pos is None:
            raise IOError('cannot seek')
        self._file.seek(self._data_pos)
        self._soundpos = 0

    def tell(self):
        return self._soundpos

    def setpos(self, pos):
        if pos < 0 or pos > self.getnframes():
            raise Error, 'position not in range'
        if self._data_pos is None:
            raise IOError('cannot seek')
        self._file.seek(self._data_pos + pos * self._framesize)
        self._soundpos = pos

    def close(self):
        self._file = None

class Au_write:

    def __init__(self, f):
        if type(f) == type(''):
            import __builtin__
            f = __builtin__.open(f, 'wb')
        self.initfp(f)

    def __del__(self):
        if self._file:
            self.close()

    def initfp(self, file):
        self._file = file
        self._framerate = 0
        self._nchannels = 0
        self._sampwidth = 0
        self._framesize = 0
        self._nframes = AUDIO_UNKNOWN_SIZE
        self._nframeswritten = 0
        self._datawritten = 0
        self._datalength = 0
        self._info = ''
        self._comptype = 'ULAW' # default is U-law

    def setnchannels(self, nchannels):
        if self._nframeswritten:
            raise Error, 'cannot change parameters after starting to write'
        if nchannels not in (1, 2, 4):
            raise Error, 'only 1, 2, or 4 channels supported'
        self._nchannels = nchannels

    def getnchannels(self):
        if not self._nchannels:
            raise Error, 'number of channels not set'
        return self._nchannels

    def setsampwidth(self, sampwidth):
        if self._nframeswritten:
            raise Error, 'cannot change parameters after starting to write'
        if sampwidth not in (1, 2, 4):
            raise Error, 'bad sample width'
        self._sampwidth = sampwidth

    def getsampwidth(self):
        # check the sample width itself, not the frame rate
        if not self._sampwidth:
            raise Error, 'sample width not specified'
        return self._sampwidth

    def setframerate(self, framerate):
        if self._nframeswritten:
            raise Error, 'cannot change parameters after starting to write'
        self._framerate = framerate

    def getframerate(self):
        if not self._framerate:
            raise Error, 'frame rate not set'
        return self._framerate

    def setnframes(self, nframes):
        if self._nframeswritten:
            raise Error, 'cannot change parameters after starting to write'
        if nframes < 0:
            raise Error, '# of frames cannot be negative'
        self._nframes = nframes

    def getnframes(self):
        return self._nframeswritten

    def setcomptype(self, type, name):
        if type in ('NONE', 'ULAW'):
            self._comptype = type
        else:
            raise Error, 'unknown compression type'

    def getcomptype(self):
        return self._comptype

    def getcompname(self):
        if self._comptype == 'ULAW':
            return 'CCITT G.711 u-law'
        elif self._comptype == 'ALAW':
            return 'CCITT G.711 A-law'
        else:
            return 'not compressed'

    def setparams(self, params):
        nchannels, sampwidth, framerate, nframes, comptype, compname = params
        self.setnchannels(nchannels)
        self.setsampwidth(sampwidth)
        self.setframerate(framerate)
        self.setnframes(nframes)
        self.setcomptype(comptype, compname)

    def getparams(self):
        return self.getnchannels(), self.getsampwidth(), \
                  self.getframerate(), self.getnframes(), \
                  self.getcomptype(), self.getcompname()

    def tell(self):
        return self._nframeswritten

    def writeframesraw(self, data):
        self._ensure_header_written()
        if self._comptype == 'ULAW':
            import audioop
            data = audioop.lin2ulaw(data, self._sampwidth)
        nframes = len(data) // self._framesize
        self._file.write(data)
        self._nframeswritten = self._nframeswritten + nframes
        self._datawritten = self._datawritten + len(data)

    def writeframes(self, data):
        self.writeframesraw(data)
        if self._nframeswritten != self._nframes or \
                  self._datalength != self._datawritten:
            self._patchheader()

    def close(self):
        if self._file:
            try:
                self._ensure_header_written()
                if self._nframeswritten != self._nframes or \
                        self._datalength != self._datawritten:
                    self._patchheader()
                self._file.flush()
            finally:
                self._file = None

    #
    # private methods
    #

    def _ensure_header_written(self):
        if not self._nframeswritten:
            if not self._nchannels:
                raise Error, '# of channels not specified'
            if not self._sampwidth:
                raise Error, 'sample width not specified'
            if not self._framerate:
                raise Error, 'frame rate not specified'
            self._write_header()

    def _write_header(self):
        if self._comptype == 'NONE':
            if self._sampwidth == 1:
                encoding = AUDIO_FILE_ENCODING_LINEAR_8
                self._framesize = 1
            elif self._sampwidth == 2:
                encoding = AUDIO_FILE_ENCODING_LINEAR_16
                self._framesize = 2
            elif self._sampwidth == 4:
                encoding = AUDIO_FILE_ENCODING_LINEAR_32
                self._framesize = 4
            else:
                raise Error, 'internal error'
        elif self._comptype == 'ULAW':
            encoding = AUDIO_FILE_ENCODING_MULAW_8
            self._framesize = 1
        else:
            raise Error, 'internal error'
        self._framesize = self._framesize * self._nchannels
        _write_u32(self._file, AUDIO_FILE_MAGIC)
        header_size = 25 + len(self._info)
        header_size = (header_size + 7) & ~7
        _write_u32(self._file, header_size)
        if self._nframes == AUDIO_UNKNOWN_SIZE:
            length = AUDIO_UNKNOWN_SIZE
        else:
            length = self._nframes * self._framesize
        try:
            self._form_length_pos = self._file.tell()
        except (AttributeError, IOError):
            self._form_length_pos = None
        _write_u32(self._file, length)
        self._datalength = length
        _write_u32(self._file, encoding)
        _write_u32(self._file, self._framerate)
        _write_u32(self._file, self._nchannels)
        self._file.write(self._info)
        self._file.write('\0'*(header_size - len(self._info) - 24))

    def _patchheader(self):
        if self._form_length_pos is None:
            raise IOError('cannot seek')
        self._file.seek(self._form_length_pos)
        _write_u32(self._file, self._datawritten)
        self._datalength = self._datawritten
        self._file.seek(0, 2)

def open(f, mode=None):
    if mode is None:
        if hasattr(f, 'mode'):
            mode = f.mode
        else:
            mode = 'rb'
    if mode in ('r', 'rb'):
        return Au_read(f)
    elif mode in ('w', 'wb'):
        return Au_write(f)
    else:
        raise Error, "mode must be 'r', 'rb', 'w', or 'wb'"

openfp = open