HoverPy 0.1.14 Documentation¶
Documentation¶
Motivation¶
HoverPy speeds up and simplifies tests that depend on HTTP / HTTPS services. It does so by recording all HTTP traffic generated by your Python application inside a database file.
When you run your code again, it plays back the responses corresponding to your requests. This means during the simulate phase, no HTTP traffic gets generated whatsoever. This grants several benefits:
- Increased test speed
- Ability to work offline
- Ability to modify traffic
- Ability to simulate network latency
- Deterministic test environment
If or when the service you are testing against changes its API, you can simply delete your db file and capture the test results again. HoverPy uses a very high-performance proxy written in Go, which makes it rock solid in terms of speed and reliability.
License¶
HoverPy is licensed under the Apache License, Version 2.0. See LICENSE.txt for more details.
Contents¶
Installation¶
Cloning¶
We’re currently in development mode, so you’re better off cloning the repository.
$ git clone https://github.com/SpectoLabs/hoverpy.git
$ cd hoverpy
$ virtualenv .venv
$ source .venv/bin/activate
$ python setup.py install
This installs hoverpy and its requirements in your .venv folder. Make sure to pull often, and re-run python setup.py install when you do.
Testing¶
Please make sure everything is working before proceeding to the next steps.
$ python setup.py test
You should get a series of OKs.
Output:
...
testModify (hoverpy.tests.modify.testModify.TestModify) ... ok
testTemplate (hoverpy.tests.templates.testTemplates.TestTemplates) ... ok
testCapture (hoverpy.tests.testVirtualisation.TestVirt) ... ok
testPlayback (hoverpy.tests.testVirtualisation.TestVirt) ... ok
Running the examples¶
$ ls examples/*
basic delays modify readthedocs tornado unittesting urllib2eg urllib3eg
Please note we’ll cover the examples in the usage page. But for the truly impatient, you can try running the most basic example, just to make sure everything’s working at this point.
$ python examples/basic/basic.py
Installing from repo¶
Please note there isn’t yet much point installing HoverPy since we’re currently still in development mode. This means the code will change often. But if you really want to, then here’s the command:
$ sudo python setup.py install
Installing from PIP¶
You can also install HoverPy from PIP; however, once again you’re better off playing with your repo clone for now. But if you really wish to, here’s how:
$ pip install -i https://testpypi.python.org/pypi hoverpy
HoverFly binary¶
Please note that when you install HoverPy, the HoverFly binaries get downloaded and installed in your home directory, in
${home}/.hoverfly/bin/dist_vX.X.X/${OS}_${ARCH}/hoverfly
Introduction¶
Building and testing interdependent applications is difficult. Maybe you’re building a mobile application that needs to talk to a legacy API. Or a microservice that relies on two other services that are still in development. The problem is the same: how do you develop and test against external dependencies which you cannot control?
You could use mocking libraries as substitutes for external dependencies. But mocks are intrusive, and do not test all the way to the architectural boundary of your application. This means the client code for your external dependency is substituted and not tested.
Stub services are better, but they often involve too much configuration or may not be transparent to your application. Then there is the problem of managing test data. Often, to write proper tests, you need fine-grained control over the data in your mocks or stubs. Managing test data across large projects with multiple teams introduces bottlenecks that impact delivery times.
Integration testing “over the wire” is problematic too. When stubs or mocks are substituted for real services (in a continuous integration environment for example) new variables are introduced. Network latency and random outages can cause integration tests to fail unexpectedly.
Service Virtualisation is software that records the interactions between your application and the big, unpredictable world.
A very sturdy software solution for Service Virtualisation is Mirage, which is used extensively in the airline industry. Its successor, HoverFly, builds on the lessons learned over years of using Mirage. Both Mirage and Hoverfly are open source software, developed at specto.io.
HoverPy is the thin layer between Python and HoverFly. HoverFly is a light-weight and extremely fast proxy written in Go, and does the heavy lifting for HoverPy.
Feature overview¶
- “Capture” traffic between a client and a server application
- Use captured traffic to simulate the server application
- Export captured service data as a JSON file
- Import service data JSON files
- Simulate latency by specifying delays which can be applied to individual URLs based on regex patterns, or based on HTTP method
- Flexible request matching using templates
- Supports “middleware” (which can be written in any language) to manipulate data in requests or responses, or to simulate unexpected behaviour such as malformed responses or random errors
- Supports local or remote middleware execution (for example on AWS Lambda)
- Uses BoltDB to persist data in a binary file on disk - so no additional database is required
- REST API
- Run as a transparent proxy or as a webserver
- High performance with minimal overhead
- JUnit rule “wrapper” is available as a Maven dependency
- Supports HTTPS and can generate certificates if required
- Authentication (combination of Basic Auth and JWT)
- Admin UI to change state and view basic metrics
Use cases¶
Hoverfly is designed to cater for two high-level use cases.
Capturing real HTTP(S) traffic for re-use in testing or development¶
If the external service you want to simulate already exists, you can put Hoverfly in between your client application and the external service. Hoverfly can then capture every request from the client application and every matching response from the external service (capture mode).
These request/response pairs are persisted in Hoverfly, and can be exported to a service data JSON file. The service data file can be stored elsewhere (a Git repository, for example), modified as required, then imported back into Hoverfly (or into another Hoverfly instance).
Hoverfly can then act as a “surrogate” for the external service, returning a matched response for every request it received (simulate mode). This is useful if you want to create a portable, self-contained version of an external service to develop and test against.
This could allow you to get around the problem of rate-limiting (which can be frustrating when working with a public API). You can write Hoverfly extensions to manipulate the data in pre-recorded responses, or to simulate network latency.
You could work while offline, or you could speed up your workflow by replacing a slow dependency with a fast Hoverfly “surrogate”.
Creating simulated services for use in testing or development¶
In some cases, the external service you want to simulate might not exist yet. You can create service simulations by writing service data JSON files. This is in line with the principle of design-by-contract development. Service data files can be created by each developer, then stored in a Git repository. Other developers can then import the service data directly from the repository URL, providing them with a Hoverfly “surrogate” to work with. Instead of writing a service data file, you could write a “middleware” script for Hoverfly that generates a response “on the fly”, based on the request it receives (synthesize mode). More information on this use case is available in the “Synthetic service example” and “Easy API simulation with the Hoverfly JUnit rule” articles. Proceed to the Modes and middleware section to understand how Hoverfly is used in these contexts.
Modes and middleware¶
Hoverfly modes¶
Hoverfly has four modes. Detailed guides on how to use these modes are available in the Usage section.
Capture mode¶
In this mode, Hoverfly acts as a proxy between the client application and the external service. It transparently intercepts and stores outgoing requests from the client and matching incoming responses from the external service. This is how you capture real traffic for use in development or testing.
Simulate mode¶
In this mode, Hoverfly uses either previously captured traffic or imported service data files to mimic the external service. This is useful if you are developing or testing an application that needs to talk to an external service that you don’t have reliable access to. You can use the Hoverfly “surrogate” instead of the real service.
Synthesize mode¶
In this mode, Hoverfly doesn’t use any stored request/response pairs. Instead, it generates responses to incoming requests on the fly and returns them to the client. This mode is dependent on middleware (see below) to generate the responses.
This is useful if you can’t (or don’t want to) capture real traffic, or if you don’t want to write service data files.
Modify mode¶
In this mode, Hoverfly passes requests through to the server, and passes the responses back. However, it also executes middleware on the requests and responses. This is useful for all kinds of things, such as manipulating the data in requests and/or responses on the fly.
Middleware¶
Middleware can be written in any language, as long as that language is supported by the Hoverfly host. For example, you could write middleware in Go, Python or JavaScript (if you have Go, Python or NodeJS installed on the Hoverfly host, respectively).
Middleware is applied to the requests and/or the responses depending on the mode:
- Capture Mode: middleware affects only outgoing requests
- Simulate Mode: middleware affects only responses (cache contents remain untouched)
- Synthesize Mode: middleware creates responses
- Modify Mode: middleware affects requests and responses
Middleware can be used to do many useful things, such as simulating network latency or failure, rate limits, or controlling data in requests and responses.
A detailed guide on how to use middleware is available in the Usage section.
Usage¶
I don’t know about you, but for me the best way of getting into things is by trying them out. In the articles below I take you through simple, then increasingly complex, examples of testing with HoverPy.
basic¶
This is by far the simplest example of how to get started with HoverPy. Please run this example using:
$ python examples/basic/basic.py
You should see your IP address show up twice. Let’s walk through the code to see what’s happening.
>>> from hoverpy import HoverPy
>>> import requests
Above, we start by importing our most important class, HoverPy. We also bring in requests for our HTTP traffic.
Now let’s create our HoverPy object in capture mode. We do so with a with statement as this is the pythonic way, although this is not a necessity.
>>> with HoverPy(capture=True) as hoverpy:
Print the json from our get request. Hoverpy acted as a proxy: it made the request on our behalf, captured it, and returned it to us.
>>> print(requests.get("http://ip.jsontest.com/myip").json())
Switch HoverPy to simulate mode. HoverPy no longer acts as a proxy; all it does from now on is replay the captured data.
>>> hoverpy.simulate()
Print the json from our get request. This time the data comes from the store.
>>> print(requests.get("http://ip.jsontest.com/myip").json())
Requests.db¶
You may have noticed this created a requests.db inside your current directory. This is a BoltDB database, holding our requests and their responses.
readthedocs¶
This is a slightly more advanced example, where we query readthedocs.io for articles. In the first phase, we run the program in capture mode. This is done using the capture flag:
python examples/readthedocs/readthedocs.py --capture
the program can then be run again in simulate mode, in a fraction of the time:
python examples/readthedocs/readthedocs.py
We’ll now run through the code to see what it’s doing.
>>> from hoverpy import HoverPy
>>> import requests
>>> import time
We start our program with the usual imports. We’re using the time module to time our code.
>>> from argparse import ArgumentParser
>>> parser = ArgumentParser(description="Perform proxy testing/URL list creation")
>>> parser.add_argument("--capture", help="capture the data", action="store_true")
>>> parser.add_argument(
>>> "--limit", default=50, help="number of links to capture / simulate")
>>> args = parser.parse_args()
As you can see, we’re setting up our program with the --capture flag, which puts us in capture mode if used, or simulate mode if not. The --limit flag can be used to increase the number of articles we fetch; 50 is a good default value.
>>> def getLinks(hp, limit):
>>> print("\nGetting links in %s mode!\n" % hp.mode())
>>> start = time.time()
>>> sites = requests.get(
>>> "http://readthedocs.org/api/v1/project/?limit="
>>> "%d&offset=0&format=json" % int(limit))
>>> objects = sites.json()['objects']
>>> links = ["http://readthedocs.org" + x['resource_uri'] for x in objects]
>>> for link in links:
>>> response = requests.get(link)
>>> print("url: %s, status code: %s" % (link, response.status_code))
>>> print("Time taken: %f" % (time.time() - start))
The function above gets the 50 articles from readthedocs, and prints how long it took once we’re done.
>>> if __name__ == "__main__":
>>> with HoverPy(capture=args.capture) as hp:
>>> getLinks(hp, args.limit)
Finally our program is run. Results will vary based on your internet speed, but running in simulate mode should be around 50x to 100x faster.
unittesting¶
In this example, we’ll take a look at writing unit tests that use HoverPy. Doing so means that you, as a developer, can be entirely sure that you are testing your code against known data. This insulates you from issues with third-party APIs. Let’s begin by importing hoverpy.
>>> from hoverpy import testing
Instead of inheriting from unittest.TestCase, let’s inherit from hoverpy.testing.TestCase:
>>> class TestRTD(testing.TestCase):
In our test, we’ll once again download a load of readthedocs pages
>>> def test_rtd_links(self):
>>> import requests
>>> limit = 50
>>> sites = requests.get(
>>> "http://readthedocs.org/api/v1/project/?"
>>> "limit=%d&offset=0&format=json" % limit)
>>> objects = sites.json()['objects']
>>> links = ["http://readthedocs.org" + x['resource_uri'] for x in objects]
>>> self.assertTrue(len(links) == limit)
>>> for link in links:
>>> response = requests.get(link)
>>> print(link, response)
>>> self.assertTrue(response.status_code == 200)
Let’s run our hoverpy test case if the script is invoked directly:
>>> if __name__ == '__main__':
>>> import unittest
>>> unittest.main()
Now the correct way of launching this script the first time is:
$ env HOVERPY_CAPTURE=true python examples/unittesting/unittesting.py
which sets HoverPy in capture mode, and creates our all-important requests.db. This process may take around 10 seconds depending on your internet speed. Now when we rerun our unit tests, we’re always running against the data we captured in requests.db.
$ python examples/unittesting/unittesting.py
This time we are done in around 100ms! Not to mention: no more unnecessary breakages.
delays¶
Demonstrates how to add latency to calls, based on host and method type. Import hoverpy’s main class, HoverPy:
>>> from hoverpy import HoverPy
Import requests and random for http and testing
>>> import requests
>>> import random
Create our HoverPy object in capture mode
>>> with HoverPy(capture=True) as hp:
This function generates either an echo server URL or an md5 URL. It is seeded so that we get exactly the same requests on capture as we do on simulate.
>>> def getServiceData():
>>> for i in range(10):
>>> random.seed(i)
>>> print(
>>> requests.get(
>>> random.choice(
>>> [
>>> "http://echo.jsontest.com/i/%i" %
>>> i,
>>> "http://md5.jsontest.com/?text=%i" %
>>> i])).json())
Make the requests to the desired host dependencies
>>> print("capturing responses from echo server\n")
>>> getServiceData()
There are two ways to add delays. One is to call the delays method with the desired delay rules passed in as a JSON document:
>>> print(hp.delays({"data": [
>>> {
>>> "urlPattern": "md5.jsontest.com",
>>> "delay": 1000
>>> }
>>> ]
>>> }
>>> ))
The other, more pythonic, way is to call addDelay(...):
>>> print(hp.addDelay(urlPattern="echo.jsontest.com", delay=3000))
Now let’s switch over to simulate mode
>>> print(hp.simulate())
Make the requests. This time HoverFly adds the simulated delays. These requests would normally be run asynchronously, so we could deal gracefully with the dependency taking too long to respond.
>>> print("\nreplaying delayed responses from echo server\n")
>>> getServiceData()
modify¶
Let’s look into mutating responses using middleware. This is particularly useful for throwing curveballs at your applications, and making sure they deal with them correctly.
>>> from hoverpy import HoverPy
>>> import requests
>>> with HoverPy(
>>> modify=True,
>>> middleware="python examples/modify/modify_payload.py") as hoverpy:
Above we created our HoverPy object with modify and middleware enabled. Please note this brings in python examples/modify/modify_payload.py, which will get run on every request.
>>> for i in range(30):
>>> r = requests.get("http://time.jsontest.com")
Let’s make 30 requests to http://time.jsontest.com, which simply gets us the current local time.
>>> if "time" in r.json().keys():
>>> print(
>>> "response successfully modified, current date is " +
>>> r.json()["time"])
The time key is inside the response, which is what we expected.
>>> else:
>>> print("something went wrong - deal with it gracefully")
However, if the time key isn’t in the response, then something clearly went wrong. Next let’s take a look at the middleware.
modify_payload¶
>>> #!/usr/bin/env python
This is the payload modification script. It truly allows us to do all types of weird, wild and wonderful mutations to the data that gets sent back to our application. Let’s begin by importing what we’ll need.
>>> import sys
>>> import json
>>> import logging
>>> import random
>>> logging.basicConfig(filename='middleware.log', level=logging.DEBUG)
>>> logging.debug('Middleware "modify_request" called')
Above we’ve also configured our logging. This is essential, as it’s difficult to figure out what went wrong otherwise.
>>> def main():
>>> data = sys.stdin.readlines()
>>> payload = data[0]
>>> logging.debug(payload)
>>> payload_dict = json.loads(payload)
The response to our request gets sent to middleware via stdin. Therefore, we are really only interested in the first line.
>>> payload_dict['response']['status'] = random.choice([200, 201])
Let’s randomly switch the status of the responses between 200 and 201. This helps us build a resilient client that can deal with curveballs.
>>> if random.choice([True, False]):
>>> payload_dict['response']['body'] = "{}"
Let’s also randomly return an empty response body. This is tricky middleware indeed.
>>> print(json.dumps(payload_dict))
>>> if __name__ == "__main__":
>>> main()
It is good practice for your client to be able to deal with unexpected data. This is a great example of building middleware that’ll thoroughly test your apps.
We are now ready to run our payload modification script: python examples/modify/modify.py
Output:
>>> something went wrong - deal with it gracefully
>>> something went wrong - deal with it gracefully
>>> something went wrong - deal with it gracefully
>>> something went wrong - deal with it gracefully
>>> something went wrong - deal with it gracefully
>>> response successfully modified, current date is 01:45:15 PM
>>> something went wrong - deal with it gracefully
>>> something went wrong - deal with it gracefully
>>> response successfully modified, current date is 01:45:16 PM
>>> [...]
Excellent: above we can see how our application now deals with dire responses adequately. This is how resilient software is built!
urllib2eg¶
Import hoverpy’s main class: HoverPy
>>> from hoverpy import HoverPy
Create our HoverPy object in capture mode
>>> with HoverPy(capture=True) as hp:
Import urllib2 for http
>>> import urllib2
Build our proxy handler for urllib2. This is currently a rather crude method of initialising urllib2; this code will be incorporated into the main library shortly.
>>> proxy = urllib2.ProxyHandler({'http': 'localhost:8500'})
>>> opener = urllib2.build_opener(proxy)
>>> urllib2.install_opener(opener)
Print the json from our get request. Hoverpy acted as a proxy: it made the request on our behalf, captured it, and returned it to us.
>>> print(urllib2.urlopen("http://ip.jsontest.com/myip").read())
Switch HoverPy to simulate mode. HoverPy no longer acts as a proxy; all it does from now on is replay the captured data.
>>> hp.simulate()
Print the json from our get request. This time the data comes from the store.
>>> print(urllib2.urlopen("http://ip.jsontest.com/myip").read())
urllib3eg¶
Import hoverpy’s main class: HoverPy
>>> from hoverpy import HoverPy
Create our HoverPy object in capture mode
>>> with HoverPy(capture=True) as hp:
Import urllib3 for http, and build a proxy manager
>>> import urllib3
>>> http = urllib3.proxy_from_url("http://localhost:8500/")
Print the json from our get request. Hoverpy acted as a proxy: it made the request on our behalf, captured it, and returned it to us.
>>> print(http.request('GET', 'http://ip.jsontest.com/myip').data)
Switch HoverPy to simulate mode. HoverPy no longer acts as a proxy; all it does from now on is replay the captured data.
>>> hp.simulate()
Print the json from our get request. This time the data comes from the store.
>>> print(http.request('GET', 'http://ip.jsontest.com/myip').data)
soap¶
In this example we’ll take a look at using hoverpy when working with SOAP. To run this example, simply execute:
examples/soap/soap.py --capture
which runs the program in capture mode, then:
examples/soap/soap.py
which runs our program in simulate mode.
This program gets our IP address from http://jsontest.com, then uses it to do some geolocation using a WSDL SOAP web service. In my case, I get this:
{
'ResolveIPResult':{
'City':u'London',
'HasDaylightSavings':False,
'CountryCode':u'GB',
'AreaCode':u'0',
'Country':u'United Kingdom',
'StateProvince':u'H9',
'Longitude':-0.09550476,
'TimeZone':None,
'Latitude':51.5092,
'Organization':None,
'Certainty':90,
'RegionName':None
}
}
This is what the ip2geo service thinks is the location of the SpectoLabs office!
from hoverpy import HoverPy
import pysimplesoap
import requests
Above, we bring in our usual suspect libraries: the HoverPy class, pysimplesoap (a straightforward SOAP client), and the requests library.
from argparse import ArgumentParser
parser = ArgumentParser(description="Perform proxy testing/URL list creation")
parser.add_argument("--capture", help="capture the data", action="store_true")
args = parser.parse_args()
We use argparse so we can run our app in --capture mode first.
with HoverPy(capture=args.capture):
We then construct HoverPy either in capture, or simulate mode, depending on the flag provided.
ipAddress = requests.get("http://ip.jsontest.com/myip").json()["ip"]
We then make an HTTP GET request to http://ip.jsontest.com for our IP address. This is very similar to our basic example.
pysimplesoap.transport.set_http_wrapper("urllib2")
We now tell pysimplesoap to use urllib2, because urllib2 happens to play well with proxies.
client = pysimplesoap.client.SoapClient(
wsdl='http://ws.cdyne.com/ip2geo/ip2geo.asmx?WSDL'
)
We then build our SOAP client, pointing to the ip2geo WSDL schema description URL.
print(client.ResolveIP(ipAddress=ipAddress, licenseKey="0"))
We finally invoke the ResolveIP method on our SOAP client. To summarise: in this example we built a program that gets our IP address from one external service, builds a SOAP client using a WSDL schema description, and finally queries the SOAP service for our location using said IP address.
If you really want to prove to yourself that hoverfly is indeed playing back the requests, then you can run the script in simulate mode without an internet connection. Timing our script also shows us we’re now running approximately 10x faster.
modify soap¶
In this example we’ll take a look at using hoverpy in conjunction with middleware to modify SOAP data. This example builds upon the previous SOAP example, so I strongly suggest you do that one first.
examples/soap/soapModify.py¶
with HoverPy(modify=True, middleware="python examples/soap/modify_payload.py"):
Above, the only real difference from examples/soap/soap.py is that we’re loading up HoverPy with middleware enabled.
print(client.ResolveIP(ipAddress=ipAddress, licenseKey="0"))
When running this script with python examples/soap/soapModify.py, you should notice your city is ‘New York’. That’s the middleware modifying the result of our SOAP operation.
The XML from ip2geo¶
Before jumping into the middleware, let’s see what we’ll be modifying.
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soap:Body>
<ResolveIPResponse xmlns="http://ws.cdyne.com/">
<ResolveIPResult>
<City>New York</City>
<StateProvince>H9</StateProvince>
<Country>United Kingdom</Country>
<Organization />
<Latitude>51.5092</Latitude>
<Longitude>-0.09550476</Longitude>
<AreaCode>0</AreaCode>
<TimeZone />
<HasDaylightSavings>false</HasDaylightSavings>
<Certainty>90</Certainty>
<RegionName />
<CountryCode>GB</CountryCode>
</ResolveIPResult>
</ResolveIPResponse>
</soap:Body>
</soap:Envelope>
This is the XML that gets sent back to us after calling the ResolveIP method, as defined in http://ws.cdyne.com/ip2geo/ip2geo.asmx?WSDL. We are interested in modifying the City node.
examples/soap/modify_payload.py¶
And here are the important parts of our payload modification script.
from lxml import objectify
from lxml import etree
Above we make sure we are importing the lxml classes that will help us modify the data.
if "response" in payload_dict and "body" in payload_dict["response"]:
body = payload_dict["response"]["body"]
try:
Let’s make sure we only operate when we have a response, and it has a body.
root = objectify.fromstring(str(body))
ns = "{http://ws.cdyne.com/}"
logging.debug("transforming")
ipe = ns + "ResolveIPResponse"
ipt = ns + "ResolveIPResult"
root.Body[ipe][ipt].City = "New York"
We parse our xml and turn it into an object. Remember that our program gets our IP address, then tries to geo-locate us based on our IP. The intent of our middleware is to override the city no matter what.
objectify.deannotate(root.Body[ipe][ipt].City)
etree.cleanup_namespaces(root.Body[ipe][ipt].City)
payload_dict["response"]["body"] = etree.tostring(root)
We finally remove annotations and namespaces that got added to the City element by the objectify library, and serialise the modified body back into the response. And we are done.