Using the Frontier with Scrapy

Using Crawl Frontier is quite easy: it includes a set of Scrapy middlewares that encapsulate frontier usage and can be easily configured using Scrapy settings.

Activating the frontier

The frontier uses two different middlewares: CrawlFrontierSpiderMiddleware and CrawlFrontierDownloaderMiddleware.

To activate the frontier in your Scrapy project, just add them to the SPIDER_MIDDLEWARES and DOWNLOADER_MIDDLEWARES settings:

    SPIDER_MIDDLEWARES.update({
        'crawlfrontier.contrib.scrapy.middlewares.CrawlFrontierSpiderMiddleware': 1000,
    })

    DOWNLOADER_MIDDLEWARES.update({
        'crawlfrontier.contrib.scrapy.middlewares.CrawlFrontierDownloaderMiddleware': 1000,
    })

Create a Crawl Frontier settings file and add its path to your Scrapy settings:

FRONTIER_SETTINGS = 'tutorial/frontier/'
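The frontier settings file is a regular Python module. A minimal sketch might look like the following; the backend path and every value shown are assumptions for illustration, not documented defaults:

```python
# tutorial/frontier/settings.py (illustrative sketch; all values are assumptions)

# Assumed path of an in-memory FIFO backend shipped with Crawl Frontier.
BACKEND = 'crawlfrontier.contrib.backends.memory.FIFO'

# Hypothetical crawl limits for a small tutorial crawl.
MAX_REQUESTS = 2000
MAX_NEXT_REQUESTS = 10
```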

Organizing files

When using the frontier with a Scrapy project, we propose the following directory structure:

    my_scrapy_project/
        my_scrapy_project/
            __init__.py
            spiders/
                ...
            settings.py
            frontier/
                __init__.py
                settings.py
                middlewares.py
                backends.py
        scrapy.cfg

These are basically:

  • my_scrapy_project/frontier/settings.py: the frontier settings file.
  • my_scrapy_project/frontier/middlewares.py: the middlewares used by the frontier.
  • my_scrapy_project/frontier/backends.py: the backend(s) used by the frontier.
  • my_scrapy_project/spiders/: the Scrapy spiders folder.
  • my_scrapy_project/settings.py: the Scrapy settings file.
  • scrapy.cfg: the Scrapy config file.

Running the Crawl

Just run your Scrapy spider as usual from the command line:

scrapy crawl myspider

In case you need to disable the frontier, you can do it by overriding the FRONTIER_ENABLED setting:

scrapy crawl myspider -s FRONTIER_ENABLED=False
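Besides the command-line override, Scrapy also lets you override settings per spider through the custom_settings class attribute; a sketch, assuming a standard Scrapy spider:

```python
import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'
    # Run this spider with the frontier disabled (per-spider override).
    custom_settings = {'FRONTIER_ENABLED': False}
```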

Frontier Scrapy settings

Here’s a list of all available Crawl Frontier Scrapy settings, in alphabetical order, along with their default values and the scope where they apply:


FRONTIER_ENABLED

Default: True

Whether to enable the frontier in your Scrapy project.


FRONTIER_SCHEDULER_CONCURRENT_REQUESTS

Default: 256

Number of concurrent requests that the middleware will maintain while asking for next pages.


FRONTIER_SCHEDULER_INTERVAL

Default: 0.01

Interval, in seconds, at which the middleware checks whether more requests are needed; it indicates how often the frontier will be asked for new pages when there is a gap for new requests.
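To illustrate what these two settings control, here is a simplified, self-contained sketch (plain Python, not the actual middleware code): every interval seconds the loop asks a frontier for just enough requests to fill the gap up to the concurrency limit. FakeFrontier and poll_frontier are hypothetical names used only for this illustration.

```python
import time

class FakeFrontier:
    """Illustrative stand-in for a real frontier backend."""
    def __init__(self, total_pages):
        self.pending = list(range(total_pages))

    def get_next_requests(self, max_requests):
        # Hand out at most max_requests of the pending pages.
        batch = self.pending[:max_requests]
        self.pending = self.pending[max_requests:]
        return batch

def poll_frontier(frontier, concurrency=256, interval=0.01):
    """Every `interval` seconds, fill the in-progress pool up to `concurrency`."""
    in_progress = []
    scheduled = []
    while frontier.pending or in_progress:
        gap = concurrency - len(in_progress)
        if gap > 0:
            # Only ask the frontier for as many pages as there is room for.
            in_progress.extend(frontier.get_next_requests(gap))
        # For simplicity, pretend every request finishes before the next check.
        scheduled.extend(in_progress)
        in_progress.clear()
        time.sleep(interval)
    return scheduled
```

With concurrency=4, a frontier holding 10 pages is drained in batches of at most 4, mirroring how the middleware keeps its pool of concurrent requests topped up.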


FRONTIER_SETTINGS

Default: None

A file path pointing to the Crawl Frontier settings file.