Feeds are available over our HTTP web service. No feed needs to be collected more than once in any 24-hour period. Collecting feeds more frequently than this will simply generate needless network traffic for your application.

There are many ways to collect feeds over HTTP. In a Unix environment, tools like Wget and cURL can collect the feeds and store them locally. There is also a handy cURL client library for PHP that you can use inside your PHP application.
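As an illustration of collecting a feed and storing it locally no more than once per 24 hours, here is a minimal Python sketch using only the standard library. The function name, the ".part" temporary suffix and the file layout are assumptions made for this example, not part of the service.

```python
import os
import time
import urllib.request

MAX_AGE_SECONDS = 24 * 60 * 60  # feeds need collecting at most once per 24 hours


def fetch_if_stale(url, path, max_age=MAX_AGE_SECONDS, now=None):
    """Download url to path unless the local copy is younger than max_age.

    Returns True if a download happened, False if the local copy was reused.
    """
    now = time.time() if now is None else now
    if os.path.exists(path) and now - os.path.getmtime(path) < max_age:
        return False
    tmp_path = path + ".part"  # hypothetical temporary name
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
        out.write(response.read())
    os.replace(tmp_path, path)  # swap in the finished file in one step
    return True
```

A call might look like fetch_if_stale(feed_url, "events.xml"), where feed_url is the feed URL you have been given. Writing to a temporary file first means a failed download never overwrites the last good copy.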

Service Performance and Rate Limiters

Rate limiters exist on the service to prevent abuse. Each server in the farm limits the number of simultaneous feed requests of a given type that can be performed.

The limits apply to a specific feed URL and are each managed separately within the service. These limits are:

  • 3 concurrent requests per service per customer per server.
  • 9 concurrent requests per service per server.

These limits prevent abusive implementations that perform highly parallel requests from overloading our service. Highly parallelised processor implementations, and implementations that pass page requests straight through to the XML event engine without caching the results, are an unsupported use of our system. If you have performance problems, send a description of the problem you are encountering to tech@whatsonwhen.com for us to review.
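One way to stay within the concurrency limits listed above is to cap parallelism on the client side. The sketch below is a Python illustration that assumes you supply your own fetch_one function to retrieve a single item; fetch_all and MAX_CONCURRENT are names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 3  # matches the per-customer limit listed above


def fetch_all(ids, fetch_one, max_workers=MAX_CONCURRENT):
    """Call fetch_one(id) for every ID with at most max_workers in flight.

    Returns a dict mapping each ID to the result of fetch_one.
    """
    ids = list(ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with ids
        results = list(pool.map(fetch_one, ids))
    return dict(zip(ids, results))
```

Because the pool never runs more than three workers, the client cannot exceed three concurrent requests regardless of how many IDs are queued. A lower value, or fully sequential fetching, is a safer choice if other processes share the same account.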

The Import Process

Ensure that whatever import process you design can work from a clean slate: you will want to remove all article content from your system and re-pull a complete feed once a month, so that your database does not grow without bound. Our feed system separates the list of IDs that are 'valid' within the feed from the actual article/location/category content because the valid list reflects what is valid at the time of the request. As things change, the valid list changes, as expected; the feed also automatically drops articles from its listing when they become invalid (e.g. when they fall out of date scope). Cleaning the database once a month is therefore recommended. If an article becomes valid again (e.g. a future date is added to it), it will automatically reappear in the feed.
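A clean-slate rebuild can be sketched in Python as follows. The two callables, fetch_valid_ids and fetch_detail, are hypothetical stand-ins for your own code that requests the valid ID list and the content for one ID; the in-memory dict stands in for whatever database you use.

```python
def full_rebuild(fetch_valid_ids, fetch_detail):
    """Build a fresh store containing only currently valid records.

    fetch_valid_ids() returns the IDs valid at the time of the request.
    fetch_detail(record_id) returns the content for one ID.
    """
    fresh = {}
    for record_id in fetch_valid_ids():
        fresh[record_id] = fetch_detail(record_id)
    return fresh
```

Building the new store separately and replacing the old one only after the function returns means records that have dropped out of the feed disappear, while a failed rebuild leaves the existing data in place.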

Also note that feeds take a last-modified timestamp. Whenever possible, make your feed operations atomic, so that a failed run leaves both your data and your recorded timestamp unchanged. Your logic should be:

  1. Ask for a list of IDs updated since the timestamp of the last successful feed operation.
  2. Retrieve detailed content for each ID - the only IDs which will appear in that list are those which have been updated since the timestamp you provided. If you have an existing record or file for that ID, overwrite it with the new data.
  3. Update your recorded timestamp.
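The three steps above can be sketched in Python as follows. This is an illustration under assumptions: fetch_ids_since and fetch_detail are hypothetical callables wrapping your own HTTP requests, and the store is a plain dict standing in for your database or files.

```python
def incremental_update(store, last_success, fetch_ids_since, fetch_detail, now):
    """Apply one incremental feed run and return the new timestamp.

    store: dict of record ID -> content, changed only if every fetch succeeds.
    last_success: timestamp of the last successful feed operation.
    now: timestamp taken before the ID list is requested.
    """
    # Step 1: IDs updated since the last successful run.
    changed_ids = fetch_ids_since(last_success)
    # Step 2: fetch everything into a staging dict first, so an error
    # part-way through leaves the store untouched.
    updates = {record_id: fetch_detail(record_id) for record_id in changed_ids}
    store.update(updates)  # overwrite existing records with the new data
    # Step 3: the caller records this value only because the run succeeded.
    return now
```

If any request raises an exception, the function exits before the store is modified and no new timestamp is returned, so the next run simply asks again from the old timestamp. Taking the new timestamp before the ID list is requested is a design choice of this sketch, intended to avoid missing changes made while the run is in progress.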