Tests¶
Frontera tests are implemented using the pytest tool.
You can install pytest and the additional required libraries used in the tests using pip:
pip install -r requirements/tests.txt
Writing tests¶
All functionality (including new features and bug fixes) must include a test case to check that it works as expected, so please include tests for your patches if you want them to get accepted sooner.
Backend testing¶
A base pytest class for Backend
testing is provided:
BackendTest
-
class
frontera.tests.backends.
BackendTest
¶ A simple pytest base class with helper methods for
Backend
testing.-
get_settings
()¶ Returns backend settings
-
get_frontier
()¶ Returns frontierManager object
-
setup_backend
(method)¶ Setup method called before each test method call
-
teardown_backend
(method)¶ Teardown method called after each test method call
-
Let’s say for instance that you want to test to your backend MyBackend
and create a new frontier instance for each
test method call, you can define a test class like this:
class TestMyBackend(backends.BackendTest):
backend_class = 'frontera.contrib.backend.abackend.MyBackend'
def test_one(self):
frontier = self.get_frontier()
...
def test_two(self):
frontier = self.get_frontier()
...
...
And let’s say too that it uses a database file and you need to clean it before and after each test:
class TestMyBackend(backends.BackendTest):
backend_class = 'frontera.contrib.backend.abackend.MyBackend'
def setup_backend(self, method):
self._delete_test_db()
def teardown_backend(self, method):
self._delete_test_db()
def _delete_test_db(self):
try:
os.remove('mytestdb.db')
except OSError:
pass
def test_one(self):
frontier = self.get_frontier()
...
def test_two(self):
frontier = self.get_frontier()
...
...
Testing backend sequences¶
To test Backend
crawling sequences you can use the
BackendSequenceTest
class.
-
class
frontera.tests.backends.
BackendSequenceTest
¶ A pytest base class for testing
Backend
crawling sequences.-
get_sequence
(site_list, max_next_requests, downloader_simulator=<frontera.utils.tester.BaseDownloaderSimulator object>, frontier_tester=<class 'frontera.utils.tester.FrontierTester'>)¶ Returns an Frontera iteration sequence from a site list
Parameters: - site_list (list) – A list of sites to use as frontier seeds.
- max_next_requests (int) – Max next requests for the frontier.
-
assert_sequence
(site_list, expected_sequence, max_next_requests)¶ Asserts that crawling sequence is the one expected
Parameters: - site_list (list) – A list of sites to use as frontier seeds.
- max_next_requests (int) – Max next requests for the frontier.
-
BackendSequenceTest
class will run a complete crawl of the passed
site graphs and return the sequence used by the backend for visiting the different pages.
Let’s say you want to test to a backend that sort pages using alphabetic order. You can define the following test:
class TestAlphabeticSortBackend(backends.BackendSequenceTest):
backend_class = 'frontera.contrib.backend.abackend.AlphabeticSortBackend'
SITE_LIST = [
[
('C', []),
('B', []),
('A', []),
],
]
def test_one(self):
# Check sequence is the expected one
self.assert_sequence(site_list=self.SITE_LIST,
expected_sequence=['A', 'B', 'C'],
max_next_requests=0)
def test_two(self):
# Get sequence and work with it
sequence = self.get_sequence(site_list=SITE_LIST,
max_next_requests=0)
assert len(sequence) > 2
...
Testing basic algorithms¶
If your backend uses any of the basic algorithms logics, you can just inherit the correponding test base class for each logic and sequences will be automatically tested for it:
from frontera.tests import backends
class TestMyBackendFIFO(backends.FIFOBackendTest):
backend_class = 'frontera.contrib.backends.abackend.MyBackendFIFO'
class TestMyBackendLIFO(backends.LIFOBackendTest):
backend_class = 'frontera.contrib.backends.abackend.MyBackendLIFO'
class TestMyBackendDFS(backends.DFSBackendTest):
backend_class = 'frontera.contrib.backends.abackend.MyBackendDFS'
class TestMyBackendBFS(backends.BFSBackendTest):
backend_class = 'frontera.contrib.backends.abackend.MyBackendBFS'
class TestMyBackendRANDOM(backends.RANDOMBackendTest):
backend_class = 'frontera.contrib.backends.abackend.MyBackendRANDOM'