The pages here give an introduction to how to work with the range of geographic data on this website. We also encourage you to experiment. Any blue text on the site is a link: click around the site and see what you can find.
The data on this site is organised into datasets. The first step to finding data is to select relevant datasets. Learn more about Finding Data. To get started quickly: use the search menu at the top of this page, and enter keywords related to your topic of interest.
A dataset is a collection of data on a topic, with shared metadata (e.g. licence, description, modified date). To filter a dataset to find the data you want, start by choosing (or 'locking') the values of the data you are interested in. Learn more about exploring Datasets.
A simple but important principle of the system is that every dataset, every data point, and every view of the data has a URL that you can link to, or bookmark, or send to someone else.
This site uses an approach called Linked Data for representing the information it holds. You don't need to know about this to use the site, but some background knowledge may be useful.
The PublishMyData platform on which this site is based turns a graph database containing Linked Data into a browsable, searchable bank of documentation. When you just want to explore what's available, you'll appreciate the streamlined, space-efficient default view, but when it's time to dig into the gory details of how the data is implemented, you can dig as deep as you need via the other tabs.
We provide a suite of features to make developing with our data easy, fun and productive.
Most pages in this site have an API tab which includes details of how to access the data programmatically. This page describes some overarching principles. Note: in the Atlas section this information is in the 'Technical Information' section near the bottom of the page.
We provide a SPARQL endpoint which allows you to query the data store using the SPARQL 1.1 language. This is the most flexible way to access the data. In addition, we make it easy to write bookmarkable re-usable parameterised queries to speed up repetitive and/or frequent tasks. See the API tab on the SPARQL endpoint page for more details.
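As a rough illustration of querying the endpoint programmatically, the sketch below builds a GET URL that carries a query in the `query` parameter defined by the SPARQL 1.1 Protocol. The endpoint address `http://statistics.data.gov.uk/sparql` is an assumption made for this example and is not stated on this page; check the API tab on the SPARQL endpoint page for the real one.

```python
from urllib.parse import urlencode

# Assumption: hypothetical endpoint address, used only for illustration.
SPARQL_ENDPOINT = "http://statistics.data.gov.uk/sparql"


def build_sparql_url(query, endpoint=SPARQL_ENDPOINT):
    """Return a GET URL with the query URL-encoded in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


# A deliberately small query: LIMIT keeps the response well under the size cap.
EXAMPLE_QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
url = build_sparql_url(EXAMPLE_QUERY)
```

The resulting URL can be bookmarked or passed to any HTTP client.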
You can also obtain the details of a Linked Data resource by looking up (or dereferencing) its URI.
Following standard best practice for Linked Data, we distinguish between a real-world resource and documents describing that resource.
When you look up a real-world resource (e.g. a place) via its URI, you will be redirected (303: See Other) to the corresponding descriptive document. You end up viewing the same HTML page either way, but there's a semantic distinction (and a different HTTP status code). Where a URI identifies something that is essentially a document (an 'information resource'), we respond with a success (200: OK), because the URI and the document page URL are one and the same. This is the case for datasets, statistical observations, ontology terms and concept schemes.
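The distinction can be summarised in a few lines of code. This is a minimal sketch, assuming you have the status code (and, for a redirect, the `Location` header) of the first response before any redirect is followed; the function name `describe_lookup` is made up for this example.

```python
def describe_lookup(status_code, location=None):
    """Interpret the first (un-followed) response to a URI lookup."""
    if status_code == 303:
        # Real-world resource: the server points at a document describing it.
        return f"real-world resource, described by the document at {location}"
    if status_code == 200:
        # Information resource: the URI is itself the document URL.
        return "information resource, served directly"
    return f"unexpected status {status_code}"
```

Most HTTP clients follow the 303 automatically, so you only see the difference if you disable redirect handling or inspect the redirect history.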
On every page in this site that represents a Linked Data resource, the URI of the resource appears on the API tab. This identifies the resource in the database. If the URI is in the domain of this site (i.e. http://statistics.data.gov.uk) it can be dereferenced in the formats listed below simply by issuing a GET request to its URI.
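A GET request of this kind might be prepared as in the sketch below. It assumes the format is chosen through the HTTP `Accept` header and that `text/turtle` is one of the supported formats; neither is confirmed on this page, so treat both as placeholders. The dataset URI is the illustrative `my-dataset` one used elsewhere in these pages.

```python
import urllib.request


def build_dereference_request(uri, accept="text/turtle"):
    """Prepare (but do not send) a GET request for a resource URI.

    Assumption: the server selects the response format from the Accept header.
    """
    return urllib.request.Request(uri, headers={"Accept": accept}, method="GET")


request = build_dereference_request("http://statistics.data.gov.uk/data/my-dataset")
# To send it: urllib.request.urlopen(request)
```

Nothing is sent until the request is passed to `urllib.request.urlopen`, which also follows any redirect the server issues.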
In addition to dereferencing a resource's URI, to access information about individual resources, you can use the following URL pattern:
This is especially useful for resources that we hold information about in our database but that aren't in the site's domain (so you can't dereference them in this site directly via their URIs). See also our URI Dereferencing Tool in the Develop menu.
For all requests to our API, if the request issues a query to the database that causes more than 10 MB of data to be returned, we will respond with HTTP status code 400, with a message in the response body including the phrase "Response too large". Note that full dumps of all datasets are available (in N-Triples format), and data cubes are additionally available in CSV format. See the API tab on dataset pages.
The table below details the various error responses that may be returned from an API request.
|Error type||HTTP status code||Notes|
|Response too large||400||The response body contains text including the phrase "Response too large."|
|SPARQL Syntax Error||400||The response body contains text with details of the error.|
|Resource Not Found||404||Returned if you request a resource or URL that doesn't exist.|
|Not Acceptable||406||Returned if you request a non-supported data format.|
|Query Timeouts||503||The timeout for requesting data from our database will initially be set to 10 seconds.|
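Client code can map these responses back to the error types in the table. The sketch below is one possible approach, not part of the API itself: the function name is hypothetical, and it assumes that any 400 response without the phrase "Response too large" is a SPARQL syntax error, which holds only if the table above is exhaustive.

```python
def classify_api_error(status_code, body=""):
    """Map an HTTP status code and response body to an error type from the table."""
    if status_code == 400:
        # Both 400 cases are told apart by the text of the response body.
        if "Response too large" in body:
            return "Response too large"
        return "SPARQL Syntax Error"  # assumption: the only other 400 case
    other_errors = {
        404: "Resource Not Found",
        406: "Not Acceptable",
        503: "Query Timeouts",
    }
    return other_errors.get(status_code, "Unknown error")
```

A "Response too large" result is a hint to narrow the query (for example with a LIMIT clause) or to use the dataset dumps instead.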
All data in the database are stored in named graphs. The API tab of every Linked Data resource contains a table of information about that resource, along with which graph(s) the data are stored in.
Each dataset itself has a URI, e.g. http://statistics.data.gov.uk/data/my-dataset. The metadata we store about each dataset (which is returned by dereferencing the dataset's URI) is stored in its own separate graph.
The contents of each dataset are contained within separate named graphs, e.g. http://statistics.data.gov.uk/graph/my-dataset. The graph name for a dataset's contents is specified in the metadata (using the predicate <http://publishmydata.com/def/dataset#graph>) and mentioned on the API tab of the dataset page.
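To look up the contents graph of a dataset from its metadata, a query along the following lines could be sent to the SPARQL endpoint. This is a sketch under assumptions: the helper name is hypothetical, and wrapping the pattern in a `GRAPH` clause assumes the metadata triples are only visible inside their named graph.

```python
# Predicate that links a dataset to the graph holding its contents.
GRAPH_PREDICATE = "http://publishmydata.com/def/dataset#graph"


def contents_graph_query(dataset_uri):
    """Build a SPARQL query that returns the contents graph of a dataset."""
    return (
        "SELECT ?contentsGraph WHERE {\n"
        "  GRAPH ?metadataGraph {\n"
        f"    <{dataset_uri}> <{GRAPH_PREDICATE}> ?contentsGraph .\n"
        "  }\n"
        "}"
    )


query = contents_graph_query("http://statistics.data.gov.uk/data/my-dataset")
```

For the illustrative `my-dataset` URI, the `?contentsGraph` binding would be expected to hold a graph name such as http://statistics.data.gov.uk/graph/my-dataset.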
Each vocabulary is contained in its own graph. Vocabularies don't have separate metadata graphs.