
I need to generate multipart/form-data messages from Python. Never mind why. I dug around in the documentation for httplib, urllib, and urllib2, but it seems this is not currently supported (it’s Issue 3244). I didn’t like the code I found on the web for this, because I needed to set additional headers on each part. So I wrote something. Here it is. If it’s useful to you, great! If you find bugs in it, please let me know. I think it is pretty easy to use.
Make an instance of Multipart, and then add parts using the field and file methods.
>>> from multipart import Multipart
>>> m = Multipart()
>>> m.field('search','searchish term')
>>> m.file('greet','greet.txt','Hello multipart world!',{'Content-Type':'text/text'})
>>> ct,body = m.get()
>>> print ct
multipart/form-data; boundary=----------AaB03x
>>> print body
------------AaB03x
Content-Type: application/octet-stream
Content-Disposition: form-data; name="search"

searchish term
------------AaB03x
Content-Type: text/text
Content-Disposition: form-data; name="greet"; filename="greet.txt"

Hello multipart world!
------------AaB03x--

If no content type is specified for a value, then the default of application/octet-stream is chosen. If none is specified for a file, then the mime libraries are consulted to guess one based on the filename, and again the default is application/octet-stream if none can be guessed.
If you want to specify the content type, make sure to use the string ‘Content-Type’ (note the caps). Any other capitalization is not recognized, so the default Content-Type header gets added alongside yours. I’ve seen the capitalization all over the map, but needed to choose one to use in the case-sensitive dict. If you don’t like my choice, change the code. See the constants at the start of the Part class.
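The content type rules described above can be sketched in a few lines. This is only an illustration written for Python 3, not part of the module: resolve_content_type and normalize_header_names are hypothetical helper names, and the constants simply mirror the ones in the listing further down.

```python
import mimetypes

CONTENT_TYPE = 'Content-Type'  # the exact spelling the lookup expects
DEFAULT_CONTENT_TYPE = 'application/octet-stream'


def resolve_content_type(headers, filename=None):
    """Return the content type a part would end up with (hypothetical helper)."""
    if CONTENT_TYPE in headers:  # dict lookup, so it is case-sensitive
        return headers[CONTENT_TYPE]
    if filename is not None:
        guessed = mimetypes.guess_type(filename)[0]
        if guessed:
            return guessed
    return DEFAULT_CONTENT_TYPE


def normalize_header_names(headers):
    """Rewrite keys such as 'content-type' to 'Content-Type' (hypothetical helper)."""
    return dict(('-'.join(word.capitalize() for word in key.split('-')), value)
                for key, value in headers.items())


# A lowercase key is not seen by the lookup, so the default wins.
print(resolve_content_type({'content-type': 'text/html'}))
# After normalizing the key, the caller's value is used.
print(resolve_content_type(normalize_header_names({'content-type': 'text/html'})))
```

Normalizing the keys before building a part is one possible way to avoid the capitalization trap without editing the constants.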
Here’s how you might continue the above example to send this out in an HTTP request.
import urllib2
request = urllib2.Request(url='http://my.fake.server', headers={'Content-Type':ct}, data=body)
reply = urllib2.urlopen(request)
print reply.read()
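The snippet above is Python 2. For readers on Python 3, where urllib2 became urllib.request, a rough equivalent might look like the sketch below. The build_request name is made up for this example, and encoding the body as UTF-8 is an assumption that only suits text parts. The request is built but not sent, since the server name is a placeholder.

```python
import urllib.request


def build_request(url, content_type, body):
    """Build, but do not send, a POST request carrying a multipart body."""
    if isinstance(body, str):
        # Python 3 requires bytes for the request body.
        # UTF-8 is an assumption here; binary parts need different handling.
        body = body.encode('utf-8')
    return urllib.request.Request(url, data=body,
                                  headers={'Content-Type': content_type})


ct = 'multipart/form-data; boundary=----------AaB03x'
body = '------------AaB03x--\r\n'
request = build_request('http://my.fake.server', ct, body)
print(request.get_method())  # the method becomes POST once data is supplied

# To actually send it (needs a reachable server):
# reply = urllib.request.urlopen(request)
# print(reply.read())
```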
Finally, here’s the code. Enjoy! This comes, of course, with no warranty, use at your own risk, etc.
#!/usr/bin/python
'''
Classes for using multipart form data from Python, which does not (at the
time of writing) support this directly.

To use this, make an instance of Multipart and add parts to it via the factory
methods field and file.  When you are done, get the content via the get method.

@author: Stacy Prowell (http://stacyprowell.com)
'''

import mimetypes


class Part(object):
    '''
    Class holding a single part of the form.  You should never need to use
    this class directly; instead, use the factory methods in Multipart:
    field and file.
    '''

    # The boundary to use.  This is shamelessly taken from the standard.
    BOUNDARY = '----------AaB03x'
    CRLF = '\r\n'
    # Common headers.
    CONTENT_TYPE = 'Content-Type'
    CONTENT_DISPOSITION = 'Content-Disposition'
    # The default content type for parts.
    DEFAULT_CONTENT_TYPE = 'application/octet-stream'

    def __init__(self, name, filename, body, headers):
        '''
        Make a new part.  The part will have the given headers added initially.

        @param name: The part name.
        @type name: str
        @param filename: If this is a file, the name of the file.  Otherwise
                         None.
        @type filename: str
        @param body: The body of the part.
        @type body: str
        @param headers: Additional headers, or overrides, for this part.
                        You can override Content-Type here.
        @type headers: dict
        '''
        self._headers = headers.copy()
        self._name = name
        self._filename = filename
        self._body = body
        # We respect any content type passed in, but otherwise set it here.
        # We set the content disposition now, overwriting any prior value.
        if self._filename is None:
            self._headers[Part.CONTENT_DISPOSITION] = \
                ('form-data; name="%s"' % self._name)
            self._headers.setdefault(Part.CONTENT_TYPE,
                                     Part.DEFAULT_CONTENT_TYPE)
        else:
            self._headers[Part.CONTENT_DISPOSITION] = \
                ('form-data; name="%s"; filename="%s"' %
                 (self._name, self._filename))
            self._headers.setdefault(Part.CONTENT_TYPE,
                                     mimetypes.guess_type(filename)[0]
                                     or Part.DEFAULT_CONTENT_TYPE)
        return

    def get(self):
        '''
        Convert the part into a list of lines for output.  This includes
        the boundary lines, part header lines, and the part itself.  A
        blank line is included between the header and the body.

        @return: Lines of this part.
        @rtype: list
        '''
        lines = []
        lines.append('--' + Part.BOUNDARY)
        for (key, val) in self._headers.items():
            lines.append('%s: %s' % (key, val))
        lines.append('')
        lines.append(self._body)
        return lines


class Multipart(object):
    '''
    Encapsulate multipart form data.  To use this, make an instance and then
    add parts to it via the two methods (field and file).  When done, you can
    get the result via the get method.

    See http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2 for
    details on multipart/form-data.

    Watch http://bugs.python.org/issue3244 to see if this is fixed in the
    Python libraries.
    '''

    def __init__(self):
        self.parts = []
        return

    def field(self, name, value, headers={}):
        '''
        Create and append a field part.  This kind of part has a field name
        and value.

        @param name: The field name.
        @type name: str
        @param value: The field value.
        @type value: str
        @param headers: Headers to set in addition to disposition.
        @type headers: dict
        '''
        self.parts.append(Part(name, None, value, headers))
        return

    def file(self, name, filename, value, headers={}):
        '''
        Create and append a file part.  This kind of part has a field name,
        a filename, and a value.

        @param name: The field name.
        @type name: str
        @param filename: The name of the file.
        @type filename: str
        @param value: The field value.
        @type value: str
        @param headers: Headers to set in addition to disposition.
        @type headers: dict
        '''
        self.parts.append(Part(name, filename, value, headers))
        return

    def get(self):
        '''
        Get the multipart form data.  This returns the content type, which
        specifies the boundary marker, and also returns the body containing
        all parts and boundary markers.

        @return: content type, body
        @rtype: tuple
        '''
        lines = []
        for part in self.parts:
            lines += part.get()
        lines.append('--' + Part.BOUNDARY + '--')
        lines.append('')
        # We have to return the content type, since it specifies the boundary.
        content_type = 'multipart/form-data; boundary=%s' % Part.BOUNDARY
        return content_type, Part.CRLF.join(lines)
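One way to sanity-check output in this format is to feed it back through the standard library's email parser, which understands MIME multipart bodies. The sketch below assumes Python 3. To stay self-contained it uses encode_parts, a condensed stand-in for Multipart.get() written for this example, and parse_parts, a hypothetical helper, so neither is part of the module above.

```python
from email import message_from_bytes

CRLF = '\r\n'
BOUNDARY = '----------AaB03x'  # same fixed boundary as the listing above


def encode_parts(parts):
    """Encode (name, filename, content_type, body) tuples as multipart/form-data.

    A condensed stand-in for Multipart.get(), not the original class.
    """
    lines = []
    for name, filename, content_type, body in parts:
        disposition = 'form-data; name="%s"' % name
        if filename is not None:
            disposition += '; filename="%s"' % filename
        lines += ['--' + BOUNDARY,
                  'Content-Disposition: ' + disposition,
                  'Content-Type: ' + content_type,
                  '',
                  body]
    lines += ['--' + BOUNDARY + '--', '']
    return 'multipart/form-data; boundary=%s' % BOUNDARY, CRLF.join(lines)


def parse_parts(content_type, body):
    """Parse the body back; return (name, filename, content type) per part."""
    # The email parser wants a header block, so prepend the Content-Type.
    raw = ('Content-Type: ' + content_type + CRLF + CRLF + body).encode('utf-8')
    message = message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError('body did not parse as multipart')
    return [(part.get_param('name', header='content-disposition'),
             part.get_filename(),
             part.get_content_type())
            for part in message.get_payload()]


ct, body = encode_parts([
    ('search', None, 'application/octet-stream', 'searchish term'),
    ('greet', 'greet.txt', 'text/text', 'Hello multipart world!'),
])
for parsed in parse_parts(ct, body):
    print(parsed)
```

If the parser reports the same names, filenames, and content types that went in, the boundary lines and the blank line between headers and body are at least well formed.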

August 14th, 2009 at 1:16 pm
Hi, your post inspired me to solve a big problem: knowing when an iframe has finished loading.
My solution for downloading a file through an iframe, and firing an event in browsers that do not support the onload event, is to send a multipart response.
The first part is the file, and the second is anything you want, either a ‘special’ string or an HTML page that contains a .
In my case, I just check every 100ms whether the iframe contains the string ‘download_done’… Another dirty hack.
My simple class:
#-*- coding: utf-8 -*-
from hashlib import sha256
import random
import time

CRLF = '\r\n'

class MultipartResponse(object):

    def __init__(self, multipart='mixed', boundary=None):
        if boundary is None:
            # No boundary given: derive one from a random number.
            random.seed(time.time())
            rand = random.randint(1000,999999999)
            boundary = sha256(str(rand)).hexdigest()
        # Always set the boundary, including when the caller supplies one.
        self.boundary = boundary
        self.content_type = 'multipart/'+multipart+'; boundary=%s' % self.boundary
        self.parts = []

    def _addPart(self, data, headers):
        part = ['--' + self.boundary] + headers + [''] + [data]
        self.parts.append(CRLF.join(part))

    def addText(self, text, charset="utf-8"):
        content_type = "Content-Type: text/plain; charset=%s" % charset
        self._addPart(text, [content_type])

    def addHTML(self, html, charset="utf-8"):
        content_type = "Content-Type: text/html; charset=%s" % charset
        self._addPart(html, [content_type])

    def addFile(self, data, mime_type, filename, charset="utf-8"):
        content_type = "Content-Type: %s; charset=%s" % (mime_type, charset)
        content_disposition = "Content-disposition: attachment; filename=%s" % filename
        self._addPart(data, [content_type, content_disposition])

    def render(self):
        all = self.parts + ['--' + self.boundary + '--'] + ['']
        return CRLF.join(all)
You can use it easily:
multipart = MultipartResponse()
multipart.addFile("pif;paf;pouf", 'text/csv', 'export.csv') # your data here!
multipart.addText('download_done')
response.setHeader('Content-Type', multipart.content_type)
return multipart.render()
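A randomly generated boundary is only safe if it does not also appear inside one of the parts. As a rough sketch of that check, assuming Python 3: make_boundary is a hypothetical helper, and it swaps in uuid4 for the seeded random number and SHA-256 hash used in the class.

```python
import uuid


def make_boundary(bodies):
    """Return a random boundary string that occurs in none of the given bodies."""
    while True:
        boundary = uuid.uuid4().hex  # 32 hexadecimal characters
        if not any(boundary in body for body in bodies):
            return boundary


boundary = make_boundary(['pif;paf;pouf', 'download_done'])
print('multipart/mixed; boundary=%s' % boundary)
```

The result could be passed as the boundary argument of a class like the one above. A collision is extremely unlikely with 32 random hex characters, so the loop is mostly a safeguard.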
March 6th, 2010 at 11:33 pm
Cool. Thanks for this blog. I am new at development and this is a big help.
May 12th, 2010 at 11:45 am
‘return’ without parameters is not needed
May 12th, 2010 at 6:09 pm
demikaze: I add it because of the python formatter / processor I use. It makes it work correctly, and it also causes the Eclipse environment to correctly indent, etc. In short, yes, it is needed (by me, not the interpreter).