Designing Pythonic APIs

Noam Elfanbaum

About me

Why are good package interfaces important?

  • The interface is the gateway for understanding what a package does.
  • Good interfaces are intuitive and easy to learn.
  • The interface cannot be changed too often.
  • Most importantly, great interfaces make programming delightful!

What we'll do

  • Review key API differences between Kenneth Reitz's popular Requests package and the standard library's urllib in some typical HTTP usage scenarios.
  • See what makes Requests so popular and easy to use.
  • Learn lessons we can implement next time we write an interface.

Disclaimer: Urllib was developed almost a decade earlier with a different set of language tools and requirements in mind.

This is an interactive talk.

Use case #1: sending a GET request

In [7]:
import urllib.request
urllib.request.urlopen('http://python.org/')
Out[7]:
<http.client.HTTPResponse at 0x7fa2b7183ef0>
In [12]:
import requests
requests.get('http://python.org/')
Out[12]:
<Response [200]>

Top level imports are nice!

  • Nicer visually.
  • The package API is exposed in the top namespace, separate from the actual implementation, so it is easy to find.
In [ ]:
# ...

from . import utils
from .models import Request, Response, PreparedRequest
from .api import request, get, head, post, patch, put, delete, options
from .sessions import session, Session
from .status_codes import codes
from .exceptions import (
    RequestException, Timeout, URLRequired,
    TooManyRedirects, HTTPError, ConnectionError,
    FileModeWarning,
)

# ...

Explicit (API endpoints) is better than implicit

  • The Requests function name explicitly states what it will do: requests.get.
  • The Urllib function name is implicit: urllib.request.urlopen. It produces a GET request because it didn't receive a data argument.
  • This explicitness makes the interface easier for new users to understand.
In [2]:
def request(method, url, **kwargs):
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)

def get(url, params=None, **kwargs):
    kwargs.setdefault('allow_redirects', True)
    return request('get', url, params=params, **kwargs)

def post(url, data=None, json=None, **kwargs):
    return request('post', url, data=data, json=json, **kwargs)

Helpful object representation

In [ ]:
<Response [200]>
# vs.
<http.client.HTTPResponse at 0x7fa2b7183ef0>
  • Requests shows a helpful string with the response status code when you examine the object.
  • Urllib just shows the default (unclear) object representation.
  • __repr__ is a great idea, use it! When you do, think about which attributes of the object are the most significant to present.
In [ ]:
class Response(object):

    # ...

    def __repr__(self):
        return '<Response [%s]>' % (self.status_code)
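
The same idea applies to any class. A minimal sketch, assuming a hypothetical Job class (not part of Requests or Urllib) whose most significant attributes are its name and state:

```python
class Job:
    """Hypothetical class, used only to illustrate a helpful __repr__."""

    def __init__(self, name, state):
        self.name = name
        self.state = state

    def __repr__(self):
        # Show the attributes a reader most likely wants to see when debugging.
        return '<Job {!r} [{}]>'.format(self.name, self.state)


job = Job('backup', 'running')
print(repr(job))  # prints: <Job 'backup' [running]>
```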

Use case #2: getting a request status code

In [ ]:
import urllib.request
response = urllib.request.urlopen('http://python.org/')
response.getcode()
In [ ]:
import requests
r = requests.get('http://python.org/')
r.status_code

No need for getters and setters

http/client.py:

In [ ]:
class HTTPResponse(io.BufferedIOBase):

    # ...

    def getcode(self):
        return self.status
  • Urllib (or actually http) uses a "getter" method to return an instance attribute.
  • Accessing an object attribute as an actual attribute (and not through a method call) makes the code clearer and less verbose.
  • Coming from another object-oriented language, you might be tempted to use getters and setters for encapsulation purposes: to have the ability to change the underlying data structure or data access without breaking the API.
  • There is no need for that in Python: the @property decorator gives you the same ability, and you can introduce it gradually, only when you need it.
  • To be fair, urllib was written before @property existed.
In [ ]:
class Response(object):
    
    # ...
    
    @property
    def ok(self):
        try:
            self.raise_for_status()
        except HTTPError:
            return False
        return True
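
The "gradual" part can be sketched like this, assuming a hypothetical Account class (not from Requests or Urllib) that started with a plain balance attribute and later needed validation. Callers keep writing account.balance before and after the change:

```python
class Account:
    """Hypothetical class: `balance` began as a plain attribute."""

    def __init__(self, balance):
        # This assignment goes through the setter defined below.
        self.balance = balance

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        # Validation added later, without changing how callers use `balance`.
        if value < 0:
            raise ValueError('balance cannot be negative')
        self._balance = value


account = Account(10)
account.balance = 25      # same syntax as a plain attribute
print(account.balance)    # prints: 25
```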

Use case #3: handling errors

In [10]:
from urllib.request import urlopen
urlopen('http://www.httpbin.org/status/400')
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-10-f489128345a6> in <module>()
      1 from urllib.request import urlopen
----> 2 response = urlopen('http://www.httpbin.org/status/400')
      3 response.getcode()

/usr/lib/python3.5/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    161     else:
    162         opener = _opener
--> 163     return opener.open(url, data, timeout)
    164 
    165 def install_opener(opener):

/usr/lib/python3.5/urllib/request.py in open(self, fullurl, data, timeout)
    470         for processor in self.process_response.get(protocol, []):
    471             meth = getattr(processor, meth_name)
--> 472             response = meth(req, response)
    473 
    474         return response

/usr/lib/python3.5/urllib/request.py in http_response(self, request, response)
    580         if not (200 <= code < 300):
    581             response = self.parent.error(
--> 582                 'http', request, response, code, msg, hdrs)
    583 
    584         return response

/usr/lib/python3.5/urllib/request.py in error(self, proto, *args)
    508         if http_err:
    509             args = (dict, 'default', 'http_error_default') + orig_args
--> 510             return self._call_chain(*args)
    511 
    512 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python3.5/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    442         for handler in handlers:
    443             func = getattr(handler, meth_name)
--> 444             result = func(*args)
    445             if result is not None:
    446                 return result

/usr/lib/python3.5/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    588 class HTTPDefaultErrorHandler(BaseHandler):
    589     def http_error_default(self, req, fp, code, msg, hdrs):
--> 590         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    591 
    592 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 400: BAD REQUEST
In [12]:
import requests
requests.get('http://www.httpbin.org/status/400')
Out[12]:
<Response [400]>

With requests you can choose to raise an exception:

In [5]:
import requests
r = requests.get('http://www.httpbin.org/geta')
r.raise_for_status()
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-5-7e2ad828dd53> in <module>()
      1 import requests
      2 r = requests.get('http://www.httpbin.org/geta')
----> 3 r.raise_for_status()

~/.virtualenvs/py-api-talk/lib/python3.5/site-packages/requests/models.py in raise_for_status(self)
    927 
    928         if http_error_msg:
--> 929             raise HTTPError(http_error_msg, response=self)
    930 
    931     def close(self):

HTTPError: 404 Client Error: NOT FOUND for url: http://www.httpbin.org/geta

Or, handle the error without one:

In [ ]:
import requests
r = requests.get('http://www.httpbin.org/status/400')
if r.ok:
    print('Success!')

Let the user choose how to handle errors

  • (This is a pretty opinionated idea)
  • Some programmers prefer exceptions, some prefer checks.
  • In some situations a check is much more elegant and sometimes it's the other way around. Let your users choose what to use when.
  • Defaulting to return codes allows that, while defaulting to exceptions does not.
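
A minimal sketch of an object that supports both styles. The Result class and FetchError exception are hypothetical; they only borrow the ok / raise_for_status names from Requests and send nothing over the network:

```python
class FetchError(Exception):
    """Hypothetical exception type for this sketch."""


class Result:
    """Hypothetical result object that lets the caller pick a style."""

    def __init__(self, status_code):
        self.status_code = status_code

    @property
    def ok(self):
        # Treat anything below 400 as success (an assumption of this sketch).
        return self.status_code < 400

    def raise_for_status(self):
        # Raise only when the caller explicitly asks for it.
        if not self.ok:
            raise FetchError('request failed with status {}'.format(self.status_code))


result = Result(404)

# Style 1: a plain check, no exception involved.
print('ok' if result.ok else 'failed')

# Style 2: opt in to an exception.
try:
    result.raise_for_status()
except FetchError as exc:
    print(exc)
```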

Use case #4: encoding, sending and decoding a POST request

In [ ]:
import urllib.parse
import urllib.request
import json

url = 'http://www.httpbin.org/post'
values = {'name' : 'Michael Foord'}

data = urllib.parse.urlencode(values).encode()
# -> b'name=Michael+Foord'
response = urllib.request.urlopen(url, data)
body = response.read().decode()
json.loads(body)
In [ ]:
import requests

url = 'http://www.httpbin.org/post'
data = {'name' : 'Michael Foord'}

response = requests.post(url, data=data)
response.json()

Easy access to common functionality

  • Requests encodes the data and decodes the JSON response out of the box, while in Urllib you have to implement those parts yourself.
  • When designing your API, think: how will my package commonly be used? What helpers can I add to make that usage easier?
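
As a rough sketch of such a helper, assume a hypothetical Reply class that wraps raw response bytes. The decode-then-parse steps users would otherwise repeat are folded into one method:

```python
import json


class Reply:
    """Hypothetical response wrapper, not the Requests implementation."""

    def __init__(self, body, encoding='utf-8'):
        self.body = body          # raw bytes as received
        self.encoding = encoding

    @property
    def text(self):
        # Decode once, so callers do not repeat .decode() everywhere.
        return self.body.decode(self.encoding)

    def json(self):
        # Shortcut for the very common "decode, then parse JSON" case.
        return json.loads(self.text)


reply = Reply(b'{"name": "Michael Foord"}')
print(reply.json()['name'])  # prints: Michael Foord
```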

On the same note, requests also provides an elegant way to send JSON content:

In [35]:
import requests

url = 'http://www.httpbin.org/post'
data = {'name' : 'Michael Foord'}

requests.post(url, json=data)
Out[35]:
<Response [200]>

Use case #5: sending an authenticated request

In [ ]:
import urllib.request

url = 'http://www.httpbin.org/basic-auth/user/pswd'

password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, url, 'user', 'pswd')
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

opener = urllib.request.build_opener(handler)
opener.open(url)
In [ ]:
import requests

url = 'http://www.httpbin.org/basic-auth/user/pswd'

# Persistent auth for every request sent through this session
session = requests.Session()
session.auth = ('user', 'pswd')
session.get(url)

# Or just for a single request:
requests.get(url, auth=('user', 'pswd'))

Provide possibilities for simple and advanced usage

  • Requests allows concise usage when sending a single request, and a more verbose one for multiple requests.
  • Don't make users go through a lengthy process when they only need a simple use case.
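
One way to offer both levels, sketched with a hypothetical Client class and fetch function. Nothing is sent over the network; the calls only return what would be used:

```python
class Client:
    """Hypothetical stand-in for a session-like object."""

    def __init__(self, auth=None):
        self.auth = auth

    def fetch(self, url, auth=None):
        # A per-call value overrides the client-wide default.
        effective_auth = auth if auth is not None else self.auth
        return {'url': url, 'auth': effective_auth}


def fetch(url, auth=None):
    """One-off shortcut: builds a throwaway Client behind the scenes."""
    return Client().fetch(url, auth=auth)


# Simple usage: one call, no setup.
single = fetch('http://example.com/', auth=('user', 'pswd'))

# Advanced usage: configure once, reuse for many calls.
client = Client(auth=('user', 'pswd'))
first = client.fetch('http://example.com/a')
second = client.fetch('http://example.com/b')
```

The module-level function is a thin wrapper over the class, so both paths share one implementation.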

Prefer Python data types over self-made ones

  • Requests' use of Python's built-in data structures makes it easy and pleasant to use.
  • There is no need to import and learn another class that belongs to the package.
In [ ]:
class PreparedRequest(RequestEncodingMixin, RequestHooksMixin):
    
    # ...
    
    def prepare_auth(self, auth, url=''):

        # ...

        if auth:
            if isinstance(auth, tuple) and len(auth) == 2:
                # special-case basic HTTP auth
                auth = HTTPBasicAuth(*auth)
  • Requests internally converts the (user, password) tuple into an HTTPBasicAuth instance.
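
The same pattern in a standalone sketch. BasicAuth and normalize_auth are hypothetical names for illustration, not the Requests classes: the public API accepts a plain tuple, and the package converts it internally:

```python
import base64


class BasicAuth:
    """Hypothetical auth class that users never have to import."""

    def __init__(self, username, password):
        self.username = username
        self.password = password

    def header(self):
        # Build the value of an HTTP Basic Authorization header.
        token = '{}:{}'.format(self.username, self.password).encode('utf-8')
        return 'Basic ' + base64.b64encode(token).decode('ascii')


def normalize_auth(auth):
    # Accept a plain (username, password) tuple and convert it internally.
    if isinstance(auth, tuple) and len(auth) == 2:
        return BasicAuth(*auth)
    # Anything else (including an existing BasicAuth) is passed through.
    return auth


auth = normalize_auth(('user', 'pswd'))
print(auth.header())
```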

Questions?