You Have Data! Now What?

Jun 23, 2014

How to Read a Content Inventory

The beauty of running an automated content inventory with CAT, the Content Analysis Tool, is that you give it a URL and tell it to go. A short time later, you have a big set of data that you didn’t have to compile by hand. You’ve saved a lot of time and painstaking effort, but how do you know what you’re looking at and, most importantly, what to do with it? Here’s a quick overview of the data your inventory contains and what it all means.


URL

Looking at a resource's URL structure enables you to evaluate several things.

Length and clarity: Shorter URLs are better for both human readability and search engine optimization. A very long URL may not display in full in some browsers and is hard to remember for anyone who later wants to type it in directly.

Identifying and addressing poorly constructed URLs is not only a favor to your human users but also an opportunity to improve your site's ranking. URLs composed of session IDs or other parameters give users no information about the content they are likely to find at that location. Multiple parameters may also affect whether a page is crawled by search engines like Google.

Navigational structure: It is common to use a content inventory as the basis of a hierarchical site map. If the URLs represent a logical directory structure, you have a great start at creating that map.
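If you export your inventory and want to run these URL checks in bulk, a small script can do a first pass. The following is only a sketch: the 100-character limit and the session parameter names are assumptions chosen for illustration, not rules from CAT or from any search engine, so adjust them to suit your site.

```python
from urllib.parse import urlsplit, parse_qs

MAX_URL_LENGTH = 100  # assumed threshold, not an official limit
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}  # assumed names


def url_issues(url):
    """Return a list of possible problems found in one URL."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    issues = []
    if len(url) > MAX_URL_LENGTH:
        issues.append("long")
    if len(params) > 1:
        issues.append("multiple parameters")
    if any(name.lower() in SESSION_PARAMS for name in params):
        issues.append("session ID")
    return issues


def path_segments(url):
    """Split the URL path into its segments, a starting point for a site map."""
    return [segment for segment in urlsplit(url).path.split("/") if segment]
```

Grouping pages by their first one or two path segments gives you a rough draft of the hierarchy, which you can then compare against the navigation your users actually see.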


Type

The type, or format, of the content (for example, HTML, video, or image) is another basic piece of information that helps you identify the overall structure and content mix of your site.

File Size

File size may interest your website management team, who care about the effect of file size on load time and performance.

Metadata: Title, Keywords, Description

Keywords have declined in importance for SEO, but the title and description metadata still matter. The title appears in the browser and in search results, so it must be unique and descriptive without being too long: get those keywords in there, but aim for under 70 characters if possible.

The metadata description also appears in search engine results, so review it to see whether it represents the content on the page and whether it is engaging or informative enough to entice readers to click through to the page.
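A quick way to apply the title guidance across a whole inventory is to flag titles that are missing, duplicated, or at or over the 70-character mark. This sketch assumes each page is a dictionary with "url" and "title" keys; those names are placeholders and may not match the columns in your own export.

```python
from collections import Counter

MAX_TITLE_LENGTH = 70  # the "under 70 characters" guideline from above


def title_problems(pages):
    """Map each URL to the title problems found for it.

    pages: list of dicts with "url" and "title" keys (assumed names).
    """
    # Compare titles case-insensitively so near-identical titles count as duplicates.
    counts = Counter(page["title"].strip().lower() for page in pages)
    problems = {}
    for page in pages:
        title = page["title"].strip()
        found = []
        if not title:
            found.append("missing")
        elif len(title) >= MAX_TITLE_LENGTH:
            found.append("too long")
        if title and counts[title.lower()] > 1:
            found.append("duplicate")
        if found:
            problems[page["url"]] = found
    return problems
```

A script can tell you a title is too long or repeated, but only a human review can tell you whether it is descriptive and engaging.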


Analytics

Analytics data give you a good indication of which pages are popular and whether people are spending time on them or immediately leaving.

Use the analytics data to find pages that have little or no traffic, and flag those for review; they may be suffering from ROT (redundant, outdated, or trivial content). You may also have an instance of valuable content that's been orphaned without navigation or buried too deeply—another reason to look at the low numbers as well as the high.
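Flagging the low-traffic pages can be as simple as a filter and a sort. In this sketch, the "pageviews" key and the threshold of 10 are assumptions for illustration; use whatever metric and cutoff make sense for your site and reporting period.

```python
def low_traffic(pages, threshold=10):
    """Return URLs of pages at or below the pageview threshold, lowest first.

    pages: list of dicts with "url" and "pageviews" keys (assumed names).
    """
    flagged = [page for page in pages if page["pageviews"] <= threshold]
    flagged.sort(key=lambda page: page["pageviews"])
    return [page["url"] for page in flagged]
```

The resulting list is a review queue, not a delete list: each page still needs a look to decide whether it is ROT or valuable content that is simply hard to find.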

Word Count

If you’re planning a localization project, knowing the word count on each page helps you estimate scope and cost. Even if you’re not localizing, it’s a quick way to find very long or very short pages that you may want to review. Short pages with little content may need further development, or you may be able to incorporate that content into another page; long pages may need to be broken up or edited for scannability.
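Both uses of word count can be sketched in a few lines: a total for scoping, plus lists of unusually short and long pages. The "word_count" key and the 100- and 1,500-word limits are assumptions picked for illustration, not recommended values.

```python
def word_count_summary(pages, short_limit=100, long_limit=1500):
    """Return (total words, short page URLs, long page URLs).

    pages: list of dicts with "url" and "word_count" keys (assumed names).
    """
    total = sum(page["word_count"] for page in pages)
    short_pages = [p["url"] for p in pages if p["word_count"] < short_limit]
    long_pages = [p["url"] for p in pages if p["word_count"] > long_limit]
    return total, short_pages, long_pages
```

For a localization estimate, you could multiply the total by a per-word rate from your vendor to get a rough first number.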

Custom Data

Besides the typical inventory elements, many people gather information that is relevant to the project and team and that sets the stage for the content audit. For example, you may add a column to indicate a status (revise, remove, retain), a business owner name, a step in the customer journey that the content maps to, a persona for whom the content would be appropriate, and so on.

H1 Tags

H1 tags are important for search engine optimization. Review them for keyword placement and clarity.

Links In and Out

A good way to assess the site's cross-linking strategy is to look at the links into and out from each page. You can also use the list of links in to find pages that are not linked from expected locations or that are available only from the sitemap or a low-level page.
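If you have the links out for each page, you can derive the links in and look for pages that are reachable only from the sitemap, or not linked at all. This is a minimal sketch that assumes the link data is a dictionary mapping each page URL to the list of URLs it links to, and that the sitemap lives at "/sitemap"; both are placeholders rather than a description of how CAT stores or exports its data.

```python
def inbound_links(links_out):
    """Invert a {page: [linked pages]} mapping into {page: set of linking pages}."""
    inbound = {url: set() for url in links_out}
    for source, targets in links_out.items():
        for target in targets:
            # Skip self-links and links to pages outside the inventory.
            if target in inbound and target != source:
                inbound[target].add(source)
    return inbound


def weakly_linked(links_out, ignore=("/sitemap",)):
    """Return pages whose only inbound links come from the ignored pages."""
    ignored = set(ignore)
    return [
        url
        for url, sources in inbound_links(links_out).items()
        if not (sources - ignored)
    ]
```

For example, given `{"/": ["/about", "/sitemap"], "/about": ["/"], "/sitemap": ["/", "/about", "/old-guide"], "/old-guide": ["/"]}`, only `/old-guide` is flagged, because nothing but the sitemap links to it.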

Images, Media, Documents

In addition to HTML pages, CAT gathers data for the images, media (audio and video files), and documents that appear on your site. Knowing which files are associated with each page helps you plan and track content through a migration process.

Use the dashboard search feature in CAT to find all the pages where a particular image or media file appears if you need to change or remove it.

The Big Picture

All this information and you haven't even left your CAT dashboard!

Once you’ve reviewed your inventory data, you can start to plan the rest of your audit process. You have a good sense of what you’re dealing with, a start on identifying some of the issues you’ll want to investigate further, and some great numbers to share with the people who are scoping the project. 

For a detailed view of how the CAT job setup, job details, and resource details work, take the tour. For more on content inventories and audits, see our Articles section. If you haven't tried CAT yet, sign up for a free trial.

Category: Content Inventory

Paula Land

Paula is co-founder and CEO of Content Insight.
