How to use this site
This site gives access to a wide range of geographic data. The quantity of data is large and there is a range of ways to explore, view and extract data of interest to you. The pages here give an introduction to how to work with the site, but probably the best way is just to experiment. Any blue text on the site is a link: click around the site and see what you can find.
Most pages on this site have an API tab, which includes details of how to access the data programmatically. The API tab on this page describes some overarching principles.
The data on this site is organised into datasets. The first step to finding data is to select relevant datasets. Learn more about Finding Data.
A dataset on this site is a collection of data on a topic, with shared metadata (e.g. license, description, modified date). Learn more about Datasets.
This site uses an approach called Linked Data for representing the information it holds. You don't need to know about this to use the site, but some background knowledge may be useful.
Rich, interactive documentation
The PublishMyData platform on which this site is based turns a graph database containing Linked Data into a browsable, searchable bank of documentation. When you just want to explore what's available, you'll appreciate the streamlined, space-efficient default view, but when it's time to dig into the gory details of how the data is implemented, you can dig as deep as you need via the other tabs.
Developing with our data
We provide a suite of features to make developing with our data easy, fun and productive.
Most pages in this site have an API tab which includes details of how to access the data programmatically. This page describes some overarching principles.
We provide a SPARQL endpoint which allows you to query the data store using the SPARQL 1.1 language. This is the most flexible way to access the data. In addition, we make it easy to write bookmarkable re-usable parameterised queries to speed up repetitive and/or frequent tasks. See the API tab on the SPARQL endpoint page for more details.
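As a rough illustration of querying the endpoint from code, the sketch below builds a GET request in the style of the SPARQL 1.1 Protocol (the query text goes in a `query` URL parameter). The endpoint URL and the `build_sparql_request` helper are assumptions made for this example; the real endpoint address is given on the SPARQL endpoint page.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: hypothetical endpoint address; check the SPARQL endpoint page.
SPARQL_ENDPOINT = "http://statistics.data.gov.uk/sparql"


def build_sparql_request(
    query: str, accept: str = "application/sparql-results+json"
) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": accept}, method="GET")


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
print(request.full_url)
# To send it (needs network access):
#   urllib.request.urlopen(request, timeout=10).read()
```

Keeping request construction separate from sending makes it easy to inspect or bookmark the generated URL before running the query.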
URI dereferencing lets you obtain the details about a Linked Data resource by looking up (or dereferencing) its URI, either
- (a) in a browser as a human-readable HTML page
- (b) programmatically in one of a number of serialisation formats
Following standard best practice for Linked Data, we distinguish between a real-world resource and documents describing that resource.
When you look up a real-world resource (e.g. a place) via its URI, you will be redirected (303: See Other) to the corresponding descriptive document. You end up viewing the same HTML page either way, but there's a semantic distinction (and a different HTTP status code). In cases where a URI identifies something that is essentially a document (an 'information resource'), we respond with a success (200: OK), because the URI and the document page URL are one and the same. This is the case for datasets, statistical observations, ontology terms and concept schemes.
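The distinction between the two status codes can be observed from code by issuing a single GET without following redirects. The sketch below is one way to do that with Python's standard library; the function names are hypothetical, and it assumes plain `http` URIs as used on this site.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def classify_lookup(status: int, location: Optional[str] = None) -> str:
    """Interpret the status code of a URI lookup as described above."""
    if status == 303:
        return f"real-world resource, described by {location}"
    if status == 200:
        return "information resource (URI and document URL coincide)"
    return f"unexpected status {status}"


def lookup(uri: str) -> str:
    """Issue one GET without following redirects (needs network access)."""
    parts = urlsplit(uri)
    connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        connection.request("GET", parts.path or "/")
        response = connection.getresponse()
        return classify_lookup(response.status, response.getheader("Location"))
    finally:
        connection.close()
```

`http.client` is used here because it never follows redirects on its own, so the 303 response remains visible to the caller.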
On every page in this site that represents a Linked Data resource, the URI of the resource appears on the API tab. This identifies the resource in the database. If the URI is in the domain of this site (i.e. http://statistics.data.gov.uk) it can be dereferenced in the formats listed below simply by issuing a GET request to its URI.
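A minimal sketch of dereferencing a URI programmatically follows. It assumes the serialisation format is selected with an HTTP `Accept` header and uses `application/n-triples` purely as an example media type; the supported formats and the helper names here are assumptions, so check the API tab for the formats actually offered.

```python
from urllib.request import Request, urlopen

SITE_DOMAIN = "http://statistics.data.gov.uk"


def build_dereference_request(
    uri: str, media_type: str = "application/n-triples"
) -> Request:
    """Build a GET request for a URI in this site's domain."""
    if not uri.startswith(SITE_DOMAIN + "/"):
        raise ValueError(f"{uri} is not in the domain {SITE_DOMAIN}")
    # ASSUMPTION: the format is negotiated via the Accept header.
    return Request(uri, headers={"Accept": media_type}, method="GET")


def dereference(uri: str, media_type: str = "application/n-triples") -> bytes:
    """Fetch the description of a resource (needs network access)."""
    request = build_dereference_request(uri, media_type)
    # urlopen follows the 303 redirect to the descriptive document.
    with urlopen(request, timeout=10) as response:
        return response.read()
```

For example, `dereference("http://statistics.data.gov.uk/data/my-dataset")` would request the description of the placeholder dataset URI used elsewhere on this page.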
In addition to dereferencing a resource's URI, to access information about individual resources, you can use the following URL pattern:
This is especially useful for resources for which we have information in our database, but which aren't in the site's domain (i.e. you can't dereference them directly on this site via their URIs). See also our URI Dereferencing Tool in the Develop menu.
For all requests to our API, if the request issues a query to the database which causes more than 10 MB of data to be returned, we will respond with HTTP status code 400, with a message in the response body including the phrase "Response too large". Note that full dumps of all datasets are available (in N-Triples format), and data cubes are additionally available in CSV format. See the API tab on dataset pages.
The table below details the various error responses that may be returned from an API request.
| Error type | HTTP status code | Notes |
|---|---|---|
| Response too large | 400 | The response body contains text including the phrase "Response too large." |
| SPARQL Syntax Error | 400 | The response body contains text with details of the error. |
| Resource Not Found | 404 | Returned if you request a resource or URL that doesn't exist. |
| Not Acceptable | 406 | Returned if you request a non-supported data format. |
| Query Timeouts | 503 | The timeout for requesting data from our database will initially be set to 10 seconds. |
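Client code can map these responses back to the error types in the table. The following is a sketch only: the function names are hypothetical, and treating any 400 response without the "Response too large" phrase as a probable syntax error is an assumption based on the two 400 rows above.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def classify_api_error(status: int, body: str = "") -> str:
    """Map an HTTP status code and response body to an error type."""
    if status == 400:
        if "Response too large" in body:
            return "Response too large"
        # ASSUMPTION: other 400 responses are most likely syntax errors.
        return "Bad request (possibly a SPARQL syntax error)"
    if status == 404:
        return "Resource Not Found"
    if status == 406:
        return "Not Acceptable"
    if status == 503:
        return "Query Timeout"
    return f"Unrecognised error (HTTP {status})"


def fetch(url: str) -> bytes:
    """Fetch a URL, converting API errors to readable exceptions."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read()
    except HTTPError as error:
        body = error.read().decode("utf-8", errors="replace")
        raise RuntimeError(classify_api_error(error.code, body)) from error
```

On a "Response too large" error, a sensible next step is to narrow the query (for example with a `LIMIT`) or to use the dataset dumps instead.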
All data in the database are stored in named graphs. The API tab of every Linked Data resource contains a table of information about that resource, along with which graph(s) the data are stored in.
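The graph information shown on the API tab can, in principle, also be retrieved with SPARQL. The sketch below builds a query that asks which named graphs contain statements about a given resource; the helper name is hypothetical and the query is a generic SPARQL 1.1 pattern rather than one taken from this site.

```python
def graphs_for_resource_query(resource_uri: str) -> str:
    """Build a SPARQL query listing the graphs that describe a resource."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in resource_uri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a valid IRI: {resource_uri!r}")
    return (
        "SELECT DISTINCT ?graph WHERE { "
        f"GRAPH ?graph {{ <{resource_uri}> ?p ?o }} "
        "}"
    )


print(graphs_for_resource_query("http://statistics.data.gov.uk/data/my-dataset"))
```

The resulting string can be submitted to the SPARQL endpoint like any other query.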
Each dataset itself has a URI, e.g. http://statistics.data.gov.uk/data/my-dataset. The metadata we store about each dataset (which is returned by dereferencing the dataset's URI) is stored in its own separate graph, for example
The contents of each dataset are contained within separate named graphs, e.g. http://statistics.data.gov.uk/graph/my-dataset. The graph name for a dataset's contents is specified in the metadata (using the predicate <http://publishmydata.com/def/dataset#graph>) and mentioned on the API tab of the dataset page.
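Putting the two pieces together, one could first look up a dataset's contents graph via the predicate above and then restrict a query to that graph. The sketch below only builds the two SPARQL query strings; the function names are hypothetical, and the dataset and graph URIs are the placeholder examples from this page.

```python
DATASET_GRAPH_PREDICATE = "http://publishmydata.com/def/dataset#graph"


def contents_graph_query(dataset_uri: str) -> str:
    """Build a query that finds the contents graph of a dataset."""
    return (
        "SELECT ?contentsGraph WHERE { "
        f"GRAPH ?metadataGraph {{ <{dataset_uri}> "
        f"<{DATASET_GRAPH_PREDICATE}> ?contentsGraph }} "
        "}"
    )


def sample_contents_query(graph_uri: str, limit: int = 10) -> str:
    """Build a query that reads a few triples from one named graph."""
    return (
        f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }} "
        f"LIMIT {int(limit)}"
    )


print(contents_graph_query("http://statistics.data.gov.uk/data/my-dataset"))
print(sample_contents_query("http://statistics.data.gov.uk/graph/my-dataset"))
```

Restricting the second query with `GRAPH <...>` keeps results to a single dataset, which also helps keep responses under the size limit.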
Each vocabulary is contained in its own graph. Vocabularies don't have separate metadata graphs.