Handling multipart/form-data in Python


I need to generate multipart/form-data messages (described in the HTML 4.01 forms specification) from Python. Never mind why. I dug around in the documentation for httplib, urllib, and urllib2, but it seems this is not currently supported (it’s Issue 3244). I didn’t like the code I found on the web for this, because I needed to set additional headers on each part. So… I wrote something. Here it is. If it’s useful to you, great! If you find bugs in it, please let me know. I think it’s pretty easy to use.

Make an instance of Multipart, and then add parts using the field and file methods.

>>> from multipart import Multipart
>>> m = Multipart()
>>> m.field('search','searchish term')
>>> m.file('greet','greet.txt','Hello multipart world!',{'Content-Type':'text/text'})
>>> ct,body = m.get()
>>> print ct
multipart/form-data; boundary=----------AaB03x
>>> print body
------------AaB03x
Content-Type: application/octet-stream
Content-Disposition: form-data; name="search"
 
searchish term
------------AaB03x
Content-Type: text/text
Content-Disposition: form-data; name="greet"; filename="greet.txt"
 
Hello multipart world!
------------AaB03x--

If no content type is specified for a value, then the default of application/octet-stream is chosen. If none is specified for a file, then the mime libraries are consulted to guess one based on the filename, and again the default is application/octet-stream if none can be guessed.
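
The selection rule described above can be sketched on its own. This is a minimal Python 3 illustration, not part of the module: pick_content_type is a hypothetical helper that only mirrors the rule (explicit header first, then a guess from the filename, then the default).

```python
import mimetypes

DEFAULT_CONTENT_TYPE = 'application/octet-stream'

def pick_content_type(filename=None, headers=None):
    """Hypothetical helper: return the Content-Type a part would end up with."""
    headers = headers or {}
    # An explicitly supplied Content-Type always wins.
    if 'Content-Type' in headers:
        return headers['Content-Type']
    # Plain fields have no filename, so they get the default.
    if filename is None:
        return DEFAULT_CONTENT_TYPE
    # Files: guess from the filename, falling back to the default.
    return mimetypes.guess_type(filename)[0] or DEFAULT_CONTENT_TYPE

print(pick_content_type())                  # a plain field
print(pick_content_type('greet.txt'))       # a file with a known extension
print(pick_content_type('README'))          # a file with no extension to guess from
print(pick_content_type('greet.txt', {'Content-Type': 'text/html'}))
```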

If you want to specify the content type, make sure to use the string ‘Content-Type’ (note the caps) or your value will not replace the default: the part ends up with both your header and the default Content-Type. I’ve seen the capitalization all over the map, but needed to choose one to use in the case-sensitive dict. If you don’t like my choice… change the code. See the constants at the start of the Part class.
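
To see why the capitalization matters, and one way to sidestep it, here is a small Python 3 sketch. normalize_header_names is a hypothetical helper and is not in the module; it assumes header names are simple hyphen-separated words.

```python
def normalize_header_names(headers):
    """Hypothetical helper: return a copy with names such as 'content-type'
    rewritten as 'Content-Type'."""
    normalized = {}
    for name, value in headers.items():
        canonical = '-'.join(word.capitalize() for word in name.split('-'))
        normalized[canonical] = value
    return normalized

# Without normalizing, a differently cased key does not match 'Content-Type',
# so setdefault adds a second entry next to the caller's one.
raw = {'content-type': 'text/plain'}
raw.setdefault('Content-Type', 'application/octet-stream')
print(len(raw))    # two separate keys

# After normalizing, the caller's value is the one that is kept.
clean = normalize_header_names({'content-type': 'text/plain'})
clean.setdefault('Content-Type', 'application/octet-stream')
print(clean)
```

Calling something like this on the headers dict before passing it to field or file would let callers use any capitalization.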

Here’s how you might continue the above example to send this out in an HTTP request.

import urllib2

request = urllib2.Request(url='http://my.fake.server',
    headers={'Content-Type':ct},
    data=body)
reply = urllib2.urlopen(request)
print reply.read()
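
The snippet above uses urllib2, which exists only in Python 2. On Python 3 the equivalent lives in urllib.request, and the body must be bytes. A rough sketch follows; the ct and body values are stand-ins for what m.get() returns, and encoding the body as UTF-8 is an assumption that only suits text content.

```python
import urllib.request

# Stand-ins for the (content type, body) pair returned by m.get().
ct = 'multipart/form-data; boundary=----------AaB03x'
body = '------------AaB03x--\r\n'

request = urllib.request.Request(
    url='http://my.fake.server',
    data=body.encode('utf-8'),      # Python 3 expects bytes for the request body
    headers={'Content-Type': ct},
)

# Supplying data makes urllib issue a POST rather than a GET.
print(request.get_method())

# Uncomment to actually send the request:
# reply = urllib.request.urlopen(request)
# print(reply.read())
```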

Finally, here’s the code. Enjoy! This comes, of course, with no warranty, use at your own risk, etc.

#!/usr/bin/python
'''
Classes for using multipart form data from Python, which does not (at the
time of writing) support this directly.
 
To use this, make an instance of Multipart and add parts to it via the factory
methods field and file.  When you are done, get the content via the get method.
 
@author: Stacy Prowell (http://stacyprowell.com)
'''
 
import mimetypes
 
class Part(object):
    '''
    Class holding a single part of the form.  You should never need to use
    this class directly; instead, use the factory methods in Multipart:
    field and file.
    '''
 
    # The boundary to use.  This is shamelessly taken from the standard.
    BOUNDARY = '----------AaB03x'
    CRLF = '\r\n'
    # Common headers.
    CONTENT_TYPE = 'Content-Type'
    CONTENT_DISPOSITION = 'Content-Disposition'
    # The default content type for parts.
    DEFAULT_CONTENT_TYPE = 'application/octet-stream'
 
    def __init__(self, name, filename, body, headers):
        '''
        Make a new part.  The part will have the given headers added initially.
 
        @param name: The part name.
        @type name: str
        @param filename: If this is a file, the name of the file.  Otherwise
                        None.
        @type filename: str
        @param body: The body of the part.
        @type body: str
        @param headers: Additional headers, or overrides, for this part.
                        You can override Content-Type here.
        @type headers: dict
        '''
        self._headers = headers.copy()
        self._name = name
        self._filename = filename
        self._body = body
        # We respect any content type passed in, but otherwise set it here.
        # We set the content disposition now, overwriting any prior value.
        if self._filename is None:
            self._headers[Part.CONTENT_DISPOSITION] = \
                ('form-data; name="%s"' % self._name)
            self._headers.setdefault(Part.CONTENT_TYPE,
                                     Part.DEFAULT_CONTENT_TYPE)
        else:
            self._headers[Part.CONTENT_DISPOSITION] = \
                ('form-data; name="%s"; filename="%s"' %
                 (self._name, self._filename))
            self._headers.setdefault(Part.CONTENT_TYPE,
                                     mimetypes.guess_type(filename)[0]
                                     or Part.DEFAULT_CONTENT_TYPE)
        return
 
    def get(self):
        '''
        Convert the part into a list of lines for output.  This includes
        the boundary lines, part header lines, and the part itself.  A
        blank line is included between the header and the body.
 
        @return: Lines of this part.
        @rtype: list
        '''
        lines = []
        lines.append('--' + Part.BOUNDARY)
        for (key, val) in self._headers.items():
            lines.append('%s: %s' % (key, val))
        lines.append('')
        lines.append(self._body)
        return lines
 
class Multipart(object):
    '''
    Encapsulate multipart form data.  To use this, make an instance and then
    add parts to it via the two methods (field and file).  When done, you can
    get the result via the get method.
 
    See http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2 for
    details on multipart/form-data.
 
    Watch http://bugs.python.org/issue3244 to see if this is fixed in the
    Python libraries.
    '''
 
    def __init__(self):
        self.parts = []
        return
 
    def field(self, name, value, headers=None):
        '''
        Create and append a field part.  This kind of part has a field name
        and value.
 
        @param name: The field name.
        @type name: str
        @param value: The field value.
        @type value: str
        @param headers: Headers to set in addition to disposition.
        @type headers: dict
        '''
        # None is the default so that no mutable dict is shared between calls.
        self.parts.append(Part(name, None, value, headers or {}))
        return
 
    def file(self, name, filename, value, headers=None):
        '''
        Create and append a file part.  This kind of part has a field name,
        a filename, and a value.
 
        @param name: The field name.
        @type name: str
        @param filename: The name of the file.
        @type filename: str
        @param value: The file content.
        @type value: str
        @param headers: Headers to set in addition to disposition.
        @type headers: dict
        '''
        # None is the default so that no mutable dict is shared between calls.
        self.parts.append(Part(name, filename, value, headers or {}))
        return
 
    def get(self):
        '''
        Get the multipart form data.  This returns the content type, which
        specifies the boundary marker, and also returns the body containing
        all parts and boundary markers.
 
        @return: content type, body
        @rtype: tuple
        '''
        # Collect every line of the message; avoid shadowing the builtin all().
        lines = []
        for part in self.parts:
            lines += part.get()
        lines.append('--' + Part.BOUNDARY + '--')
        lines.append('')
        # We have to return the content type, since it specifies the boundary.
        content_type = 'multipart/form-data; boundary=%s' % Part.BOUNDARY
        return content_type, Part.CRLF.join(lines)
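
The listing above joins str values and always uses the same boundary string. For binary payloads such as images, and on Python 3 where request bodies are bytes, one possible variation is sketched below. It is not part of the module above: encode_multipart is a hypothetical helper, and the per-message uuid boundary and the UTF-8 encoding of text values are assumptions.

```python
import uuid

CRLF = b'\r\n'

def encode_multipart(fields, files):
    """Hypothetical helper.

    fields: list of (name, text) pairs.
    files:  list of (name, filename, content_type, data_bytes) tuples.
    Returns (content_type, body_bytes).
    """
    boundary = uuid.uuid4().hex               # fresh boundary for each message
    delimiter = b'--' + boundary.encode('ascii')
    chunks = []
    for name, value in fields:
        disposition = 'Content-Disposition: form-data; name="%s"' % name
        chunks += [delimiter, disposition.encode('utf-8'), b'',
                   value.encode('utf-8')]
    for name, filename, content_type, data in files:
        disposition = ('Content-Disposition: form-data; name="%s"; filename="%s"'
                       % (name, filename))
        chunks += [delimiter,
                   disposition.encode('utf-8'),
                   ('Content-Type: %s' % content_type).encode('ascii'),
                   b'',
                   data]                      # raw bytes, passed through untouched
    chunks += [delimiter + b'--', b'']        # closing delimiter and final CRLF
    content_type = 'multipart/form-data; boundary=%s' % boundary
    return content_type, CRLF.join(chunks)

# File data should be read in binary mode, e.g. open(path, 'rb').read().
ct, body = encode_multipart(
    [('search', 'searchish term')],
    [('picture', 'photo.jpg', 'image/jpeg', b'\xff\xd8\xff\x00')],
)
print(ct)
```

Keeping everything as bytes means the file content is never decoded or re-encoded on the way into the body.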

9 Replies to “Handling multipart/form-data in Python”

  1. Hi, your post inspired me to solve a big problem: knowing whether an iframe has finished loading.
    My solution for downloading a file through an iframe, and firing an event in browsers that do not support the onload event, is to send a multipart response.
    The first part is the file, and the second is anything you want, either a ‘special’ string or an HTML page that contains a .
    In my case, I just check every 100 ms whether the iframe contains the string ‘download_done’… Another dirty hack 😉

    My simple class:

    #-*- coding: utf-8 -*-

    from hashlib import sha256
    import random
    import time

    CRLF = '\r\n'

    class MultipartResponse(object):

        def __init__(self, multipart='mixed', boundary=None):
            if boundary is None:
                # Derive a pseudo-random boundary when none is supplied.
                random.seed(time.time())
                rand = random.randint(1000, 999999999)
                boundary = sha256(str(rand)).hexdigest()
            # Store the boundary whether it was supplied or generated.
            self.boundary = boundary
            self.content_type = 'multipart/' + multipart + '; boundary=%s' % self.boundary
            self.parts = []

        def _addPart(self, data, headers):
            part = ['--' + self.boundary] + headers + [''] + [data]
            self.parts.append(CRLF.join(part))

        def addText(self, text, charset="utf-8"):
            content_type = "Content-Type: text/plain; charset=%s" % charset
            self._addPart(text, [content_type])

        def addHTML(self, html, charset="utf-8"):
            content_type = "Content-Type: text/html; charset=%s" % charset
            self._addPart(html, [content_type])

        def addFile(self, data, mime_type, filename, charset="utf-8"):
            content_type = "Content-Type: %s; charset=%s" % (mime_type, charset)
            content_disposition = "Content-disposition: attachment; filename=%s" % filename
            self._addPart(data, [content_type, content_disposition])

        def render(self):
            # Append the closing boundary and a final CRLF.
            all_parts = self.parts + ['--' + self.boundary + '--'] + ['']
            return CRLF.join(all_parts)

    You can use it easily:

    multipart = MultipartResponse()
    multipart.addFile("pif;paf;pouf", 'text/csv', 'export.csv') # your data here!
    multipart.addText('download_done')
    response.setHeader('Content-Type', multipart.content_type)
    return multipart.render()

  3. demikaze: I add it because of the python formatter / processor I use. It makes it work correctly, and it also causes the Eclipse environment to correctly indent, etc. In short, yes, it is needed (by me, not the interpreter).

  4. hi, is there a reason why this might not work on python under windows? it looks like it doesn’t write all the parts. just wanted to check if anyone faced something similar. thanks!

  5. When we tried the code, it worked fine for a TXT file. However, when we try to send an image file, it is transmitted to the server but arrives corrupted, not as a valid JPEG. We are not sure how the file path should be provided in the code so that it can pick up that file and transmit it. Any suggestions would be welcome.
