08 August 2023
Inverting parse_qs in Python
Incorrectly escaping characters is one of the more common errors I see developers run into. If you’re parsing, manipulating, or serializing URLs in code, you should try to use standard libraries that cover a lot of these nuances for you.
Python 3 has urllib.parse. Its documentation covers nearly everything you need to know, but it’s tempting to look for an answer without reading the docs. In my case, I had parsed a query string using parse_qs and was looking for the inverse function to turn the resulting dict back into a string. Here’s the answer…
Inverting parse_qs
Parse a query string with parse_qs. The string you pass in should not include the leading "?":
from urllib.parse import parse_qs
parse_qs('foo=bar&foo=baz&bing=bong')
# {'foo': ['bar', 'baz'], 'bing': ['bong']}
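To illustrate the note about the leading "?", here is a minimal sketch of what happens when it is left in, and one way to avoid the problem by letting urlsplit extract the query component first. The example.com URL and the variable names are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# A leading "?" is not stripped; it becomes part of the first key.
with_question_mark = parse_qs('?foo=bar&bing=bong')
# {'?foo': ['bar'], 'bing': ['bong']}

# Hypothetical URL for illustration: urlsplit separates out the query
# component (without the "?"), which can then be passed to parse_qs.
url = 'https://example.com/search?foo=bar&foo=baz&bing=bong'
params = parse_qs(urlsplit(url).query)
# {'foo': ['bar', 'baz'], 'bing': ['bong']}
```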
Wrong way: convert it back using urlencode with its default arguments (each list of values is serialized as a single value, namely the URL-encoded string representation of the list):
from urllib.parse import urlencode
urlencode({'foo': ['bar', 'baz'], 'bing': ['bong']})
# 'foo=%5B%27bar%27%2C+%27baz%27%5D&bing=%5B%27bong%27%5D'
Correct way: convert it back using urlencode with doseq=True:
from urllib.parse import urlencode
urlencode({'foo': ['bar', 'baz'], 'bing': ['bong']}, doseq=True)
# 'foo=bar&foo=baz&bing=bong'
Via help(urlencode):
urlencode(query, doseq=False, safe='', encoding=None, errors=None, quote_via=<function quote_plus>)
Encode a dict or sequence of two-element tuples into a URL query string.
If any values in the query arg are sequences and doseq is true, each sequence element is converted to a separate parameter.
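As a rough check of how far the "inverse" claim goes, here is a sketch of a round trip through parse_qs and urlencode with doseq=True, using a made-up query string with escaped characters. It also shows two caveats: parse_qs drops blank values unless keep_blank_values=True is passed, and the output can be a normalized form of the input rather than a byte-for-byte copy.

```python
from urllib.parse import parse_qs, urlencode

# Made-up query string with an escaped "é", an escaped "&", and a blank value.
original = 'q=caf%C3%A9+au+lait&tag=a%26b&tag=c&empty='

# By default, parse_qs silently drops parameters with blank values.
without_blanks = parse_qs(original)
# {'q': ['café au lait'], 'tag': ['a&b', 'c']}

# keep_blank_values=True preserves them, which the round trip needs.
params = parse_qs(original, keep_blank_values=True)
# {'q': ['café au lait'], 'tag': ['a&b', 'c'], 'empty': ['']}

round_tripped = urlencode(params, doseq=True)
# 'q=caf%C3%A9+au+lait&tag=a%26b&tag=c&empty='

# The result is not always identical to the input: urlencode uses
# quote_plus by default, so a space written as %20 comes back as "+".
normalized = urlencode(parse_qs('a=b%20c'), doseq=True)
# 'a=b+c'
```

Because parse_qs groups values by key, repeated keys that were interleaved with other keys in the input also come back grouped together.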
Further reading
- urllib.parse.parse_qsl - this alternative to parse_qs makes a list of tuples instead of a dict, which means duplicate keys can safely exist at the top level. As a result, using urlencode with this format works as you’d expect regardless of whether you set doseq or not, since none of the values in the tuples are sequences.
- urllib.parse.urlparse - this is a nice way to parse an entire URL. The ParseResult object is immutable, but you can use _replace and geturl to create a new URL: parsed._replace(query=new_query).geturl()
- URLSearchParams - if you’re working with query strings in the browser, then this JavaScript interface is your friend and it has good browser support too.
- mrcoles.com/urlparse - an old blog post where you can paste in any URL and it will pretty-print the URL, query string, and hash separately.
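The parse_qsl and urlparse entries above can be combined to edit the query string of a full URL. This is a minimal sketch, not the only way to do it; the example.com URL and the added page parameter are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlparse

# Hypothetical URL for illustration.
url = 'https://example.com/search?foo=bar&foo=baz&bing=bong#results'

parsed = urlparse(url)
pairs = parse_qsl(parsed.query)
# [('foo', 'bar'), ('foo', 'baz'), ('bing', 'bong')]

# Append a parameter. No doseq is needed here because every value
# in the list of tuples is a plain string, not a list.
pairs.append(('page', '2'))
new_query = urlencode(pairs)
# 'foo=bar&foo=baz&bing=bong&page=2'

# ParseResult is immutable, so _replace returns a new object.
new_url = parsed._replace(query=new_query).geturl()
# 'https://example.com/search?foo=bar&foo=baz&bing=bong&page=2#results'
```

Unlike the dict from parse_qs, the list of tuples keeps the parameters in their original order, including repeated keys.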