Review your content quality (or lack thereof) through content inventory
Charlie says: “Content inventories are a great way to get rich data and an actionable overview of all the pages on your site. You can then gain an understanding of all the content that search engines might use to rate you.
This won’t just be for the pages you think you’re putting forward or the ones on your main navigation, but for the entire site that’s available. Though this is a tip that’s existed for a while, it’s more useful than ever in 2023. We’re collectively embracing a new Google world with the helpful content update, product review updates, core updates, etc.”
What does looking at the crawl index structure mean?
“The first step to gathering the content inventory is to understand all the pages you have available so you can pull in data from them. The first part of this is crawling the active website and the pages you have that are public facing. This involves looking at your main website and your XML sitemap, and crawling both to understand the active site structure. Once you’ve got that kind of information, you can expand out to other places where content might be available on the website, but is a bit more hidden away. For example, you can use the index coverage reports in Search Console to understand other pages that are indexed but aren’t in that main active site structure.
You can use your performance reports in Search Console, or your analytics package of choice, to see all the pages that have had organic traffic. You should use some of your favourite SEO tools to look at all the pages that have rankings in the search results. You can use a backlink tool like Majestic to understand all the pages that have backlinks, whether they’re active or very old pages.
Speaking of old pages, the other place you can look is archive.org: the ‘Wayback Machine’. You can use the site’s API simply by entering a URL into the browser and it’ll throw back all the pages it’s become aware of over the course of time. This is a great source of potential pages.
You can put all these things together and, suddenly, you have a big list of all the pages that are active on your website right now. You can use this as the basis for understanding what your inventory might contain.”
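As a rough illustration of putting these sources together, the sketch below merges URL lists from each source into one inventory and records where each URL was found. The source names, the example URLs, and the normalisation rules (lower-casing the host, dropping fragments and trailing slashes) are assumptions made for the example, not part of Charlie's process; in practice each list would come from a tool export.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise(url: str) -> str:
    """Lower-case the scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def build_inventory(sources: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalised URL to the set of source names that reported it."""
    inventory: dict[str, set[str]] = {}
    for source_name, urls in sources.items():
        for url in urls:
            inventory.setdefault(normalise(url), set()).add(source_name)
    return inventory


# Hypothetical exports; real lists would be loaded from each tool's CSV or API.
sources = {
    "crawl": ["https://example.com/", "https://example.com/services/"],
    "sitemap": ["https://example.com/services", "https://example.com/blog/post-a"],
    "search_console": ["https://example.com/blog/post-a", "https://example.com/old-offer"],
    "backlinks": ["https://EXAMPLE.com/old-offer/"],
    "wayback": ["https://example.com/2015-archive"],
}

inventory = build_inventory(sources)

# URLs the crawl never reached are the more hidden pages worth a closer look.
hidden = sorted(url for url, found_in in inventory.items() if "crawl" not in found_in)
print(hidden)
```

Keeping the set of sources per URL, rather than a flat list, makes it easy to ask questions such as "which pages have backlinks but are not in the active site structure?".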
What would you do if a page isn’t indexed and you want it to be indexed? What are the typical reasons why a page isn’t indexed?
“Let’s say you have a website with 5,000 pages, and you notice there are a couple of hundred pages that have been crawled but not indexed. You can spot them and start looking for patterns suggesting why those pages might not be indexed. Is it a content duplication issue, a content quality issue, or something else around the pages? Is it a technical crawling issue? There are many reasons why that might’ve happened. The idea is to have a big inventory of all the pages so you can start spotting areas where things have gone awry.”
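One simple way to start looking for patterns is to group the crawled-but-not-indexed URLs by site section and see where they cluster. The sketch below assumes you have two plain URL lists (the full inventory and the not-indexed subset) and treats the first path segment as the "section"; both the data and that grouping rule are illustrative assumptions.

```python
from collections import Counter
from urllib.parse import urlsplit


def section_of(url: str) -> str:
    """Return the first path segment, e.g. '/category' for '/category/shoes'."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return "/" + segments[0] if segments else "/"


def unindexed_share(all_urls: list[str], not_indexed: list[str]) -> dict[str, float]:
    """Fraction of each section's URLs that are crawled but not indexed."""
    totals = Counter(section_of(u) for u in all_urls)
    missing = Counter(section_of(u) for u in not_indexed)
    return {section: missing[section] / totals[section] for section in totals}


# Hypothetical inventory: faceted category pages and a couple of blog posts.
all_urls = [
    "https://example.com/category/shoes",
    "https://example.com/category/hats?colour=red",
    "https://example.com/category/hats?colour=blue",
    "https://example.com/category/belts",
    "https://example.com/blog/post-a",
    "https://example.com/blog/post-b",
]
not_indexed = all_urls[1:4]

shares = unindexed_share(all_urls, not_indexed)
for section, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{section}: {share:.0%} not indexed")
```

A section with a much higher share than the rest of the site is a prompt to investigate, whether the cause turns out to be duplication, thin content, or a crawling issue.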
Regarding typical areas, why would Google not be indexing those pages? Could it be that you have inferior content there?
“Google is becoming more judicious about choosing what it wants to index or not. Before, it felt very much like Google would index everything and let the ranking part of the algorithm determine whether to show things in the search results or not. We’re now finding that Google is choosing not to index as much. This means that when you look at a client’s site and spot these areas of unindexed content, the cause tends to be something to do with the content quality.
It could be a technical SEO issue that’s causing near duplicates of pages, or it could be there are a lot of pages with very little content coming through. This could be because it’s an eCommerce site where, if a category has zero products, it’ll still show all the facets for it. It could be down to the fact that it’s low-quality content that’s not adding anything and so isn’t being indexed by Google. The key is to be able to spot areas like this and put your SEO hat on to investigate further.”
How do performance and load speed impact whether a piece of content is likely to get indexed or not?
“Load speed isn’t necessarily an indexing factor, unless you’re suffering from severe problems - for example, if Google decides to limit the crawl rate it’s hitting your website with because it’s finding it’s very slow to load. This can stop them from crawling deeper into the website and will reduce your crawl budget accordingly. Though it’s a rare occurrence, the crawl rate can drop due to the load performance being very poor. Sometimes Google doesn’t crawl as deep into a site, so therefore pages aren’t discovered and can’t be indexed.
Otherwise, page load speed is more useful to Google as a tie-breaking ranking factor. It’s great to have it in a content inventory alongside the other data points you’re collating, such as backlinks, organic visits, and internal links, to see whether this content is indexed or not and whether you’re putting your best foot forward.
If you have a site with multiple sections that have poor loading speed, not many internal links, and you’re not getting many organic visits, rather than assuming it’s poor-performing content you could say the content hasn’t had a chance to succeed because you haven’t linked to it much or it’s not loading quickly. You can then focus on fixing those things, if you think it’ll be valuable to do so, and see if that makes a difference. That’s where page load speed can be useful as a performance indicator factor.”
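The reasoning above can be sketched as a small triage rule: before writing a page off as poor content, check whether weak internal linking or slow loading has held it back. The field names and threshold values below are hypothetical and chosen only for illustration; they are not benchmarks from the interview.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    organic_visits: int
    internal_links: int
    load_seconds: float


def triage(page: PageStats, min_visits: int = 10, min_links: int = 3,
           max_load_seconds: float = 3.0) -> str:
    """Label a page using illustrative thresholds (assumptions, not benchmarks)."""
    if page.organic_visits >= min_visits:
        return "performing"
    # Low traffic: was the page held back by few internal links or slow loading?
    held_back = page.internal_links < min_links or page.load_seconds > max_load_seconds
    return "not had a fair chance" if held_back else "review content quality"


# Hypothetical rows from a content inventory.
slow_orphan = PageStats("https://example.com/guides/a", 2, 1, 5.2)
thin_page = PageStats("https://example.com/guides/b", 1, 12, 1.1)
healthy = PageStats("https://example.com/guides/c", 240, 8, 1.4)

for page in (slow_orphan, thin_page, healthy):
    print(page.url, "->", triage(page))
```

Pages labelled "not had a fair chance" are candidates for fixing links or speed first; only the remainder go forward for a judgement on the content itself.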
Are backlinks just external links?
“When talking about pulling the data, most of the time this refers to backlinks and external data. Does this content contribute to the backlink profile of the website? Is poor performance perhaps indicative of it being a competitive topic that this page is targeting? Would some backlinks potentially help?
Internal linking is crucially important from a site structure and architecture point of view. Is your content being referred to enough times internally regardless of the big number of visits? Are you positioning yourself to give those internal link signals to Google? Are you suggesting this is valuable content with internal links pointing to it, and the chance for users to discover it themselves through the journey of your website?”
Are you a fan of getting rid of content that isn’t indexed?
“Yes, especially if it’s not indexed because it’s no good, indexed but never shown, or if we’re seeing negative signals. You might also notice poor conversion rates, few people navigating to the page, and low content quality in general.
If the content isn’t serving a purpose and it’s not driving any benefits, how do you measure that? If it’s not doing its job it’s probably not worth keeping. Improve it or retire it in some fashion. You should make your website lean. Avoid having pages for the sake of it. Make the website as lean and efficient as possible. Start chopping things away and focus on the stuff that’s important.”
If you have some blog posts that didn’t have any external backlinks and aren’t getting any traffic, would you probably get rid of them?
“It’s important to think about the goal of the content. If the content isn’t getting traffic and it isn’t driving backlinks or positive signals, when was the content created and who was it created for? What value could you get from keeping it? It’ll still have value if it’s blog content from years ago that can still be targeted to certain keywords, or if it’s useful supplementary content for the blog post you’re creating now. If you want to build expertise you’ll need to show that depth of information, that you’ve been writing about this topic for a while at a high level.
If the content is still valuable and is still serving a purpose even without loads of backlinks, is it not getting traffic because you’re not referencing it enough? Do you need to optimise it better? Is it not working because it’s out of date? If it’s not very good stuff and no one would want to read it anyway, it will become a clear candidate to go. However, don’t throw the baby out with the bathwater.”
Regarding content inventory, how often should this be done?
“You’ll have to do an overview at the beginning. This is important in SEO and will be increasingly important for being aware of everything that’s happening on a website regarding SEO performance. Doing the content inventory at the beginning will give you a month’s worth of actionable ideas to work on. You can look at where you need to improve a section, remove a section, retire something, improve site governance, etc. By doing this at the beginning, you’ll have a big list of work and can monitor your website more effectively.
There are website tools that monitor your website and tell you when pages have been added and deleted. These tools are becoming increasingly important, especially when combined with the content inventory at the beginning. You’ll have a benchmark position to start from and understand everything you should be working on. Once you’ve done it well in the beginning (if you’ve got a strong monitoring system in place), you won’t need to run it again for a while because you’ll be aware of the new pages that have been added, deleted, or changed. You might not need to run it again for 18 months, but it’s important to play to what your setup is.”
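To give a feel for what such monitoring does under the hood, here is a minimal sketch that compares two snapshots of a site and reports pages added, deleted, or changed since the benchmark. The snapshot format (a mapping of URL path to a hash of the page text) and the function names are assumptions for the example; dedicated monitoring tools will work differently in detail.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Hash the page text so that changes can be detected cheaply."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {url: fingerprint} snapshots of the same site."""
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    changed = sorted(
        url for url in current.keys() & previous.keys() if current[url] != previous[url]
    )
    return {"added": added, "deleted": deleted, "changed": changed}


# Hypothetical snapshots: the benchmark inventory and a later crawl.
previous = {
    "/a": fingerprint("old copy"),
    "/b": fingerprint("same"),
    "/c": fingerprint("gone"),
}
current = {
    "/a": fingerprint("new copy"),
    "/b": fingerprint("same"),
    "/d": fingerprint("fresh"),
}

report = diff_snapshots(previous, current)
print(report)
```

Run on a schedule against the initial inventory, a comparison like this is what lets you go a long time between full inventories without losing track of the site.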
What shouldn’t SEOs be doing in 2023? What’s seductive in terms of time, but ultimately counterproductive?
“The temptation is to follow a process-driven or assembly-line SEO strategy for every client. This runs counter to the idea of a content inventory. Some SEOs want to fit everything neatly into a box and follow exact processes and procedures.
Sometimes that can work, but it’s difficult to apply a universal approach because everyone’s tech stack is different, everyone’s goals are different, and everyone’s publication processes are different. The idea of assembly-line SEO is seductive. It’s tempting because you will have a framework to go through. However, being bespoke to the actual needs of your customer is more important.”
Charlie Williams is an SEO and Content Strategy Consultant at Chopped Digital and you can find him at chopped.io.