How to use the IMPORTXML formula in Google Sheets

Get a selection of expert articles

The IMPORTXML Google Sheets function lets you import data from structured data types like HTML and XML. It also supports data import from CSV, TSV, RSS and ATOM XML feeds. This makes it a great tool to import data from websites for web scraping or data mining.

Syntax

=IMPORTXML(url, xpath_query, locale)

  • url – the address of the structured web page from which you want to import data. There are two ways to include this:
    • As a fully qualified location in the form of text between quotations (for example, “https://en.wikipedia.org/wiki/Tiger_Woods”)
    • As a reference to a cell that holds the url.
  • xpath_query – this is the XPath query to run on the structured data. Simply put, this specifies the information we are trying to fetch. For instance, using “//h1/@id” gives all the id attributes within the anchor <h1> tags on the page. Check out XPath syntax guidelines for detailed instructions.

If this is new to you, you may be a bit confused when looking at the xpath_query parameter. But it is just as powerful as it is confusing, so we encourage you to look into it.

IMPORTXML example

=IMPORTXML(url, “all the hyperlink titles in the page”)

In the example below, I import the data into cell B3, so that’s the destination to include in the formula.

IMPORTXML 1

The formula I used included the url in quotations as previously mentioned. It uses the Xpath query “//a/@title” to fetch all the hyperlink titles within the article

Give the formula a try! You will be warned that the spreadsheet is trying to send and receive data, just allow it to get all the information.

=IMPORTXML(https://en.wikipedia.org/wiki/Tiger_Woods, “//a/@title”)

Below is what you’ll see as Google Sheets tries to fetch the data. The higher the number of nodes in the structured data file, the longer it will take to import the data.

importxml google sheets 2

While the data import is in progress, the cell in which you typed the formula looks like it generated an error.

However, it isn’t an error, but just a temporary state. This means that the data is loading. Hover your mouse over the cell to see the following description.

importxml google sheets 3

As soon as Google Sheets loads the data, the error disappears. You’ll notice the data extends from the formula cell (B3) to the right and further down.

IMPORTXML 4

This is an auto expanding function, meaning all the values fetched will be included in the cells below it.

It’s important that you keep the expected area of the result clear of any values. Otherwise, IMPORTXML formula will result in a #REF! error.

For example, the following image demonstrates what happens if you have any values within that range.

IMPORTXML 5
=IMPORTXML(url cell reference, “all the h2 headers in the page”)

In this example, in the cell B1, I have stored the url to the page from which I am trying to import the header information.

I’ll use that cell as a reference in the IMPORTXML formula: 

=IMPORTXML(B1,”//h2″)

IMPORTXML 6

As you can see, the data being pulled here is different from the previous example, even though the URL is the same.

This time, I have a list of the headers in the article. This could be useful for a variety of reasons. If there’s other information that you want to import from the URL, I encourage you to read more about XPath queries.

Importxml could not fetch url

Naturally, this error message in Google Sheets usually means that there was a problem fetching the data.

This error can occur due to various reasons, such as:

  1. Invalid URL: Double-check that the URL provided in the IMPORTXML function is correct.
  2. Network issues: Sometimes, this error may be the result of temporary network issues. Just wait a bit and try again.
  3. Access restrictions: The website you’re trying to access may have restrictions in place. If this is the case, there is not much to do unless you can request access.
  4. Page structure changes: If the function worked before but structure of the page changes, the XPath query may no longer work correctly. Update the XPath query to match the new structure of the webpage.

A wrap-up on IMPORTXML

You can also check out the following blog post to learn how to pull data into Google Sheets from a table or a list on an HTML page with the IMPORTHTML Google Sheets function or How to use the IMPORTFEED formula.

Editor’s note: This is a revised version of a previous post that has been updated for accuracy and comprehensiveness.

Ready to streamline your spreadsheet data?

You may also like…