Ever felt that some of your keywords don’t perform as well as before? And are you wondering why other companies outrank you in Google? Then it might be smart to do a quick competitive analysis. Because in most cases, it’s not necessarily your site that’s performing worse; it’s other sites doing better. Luckily, if you want to do a competitive analysis to optimize your SEO efforts, there’s actually a lot you can do yourself. Let us take you through the steps!
Step 1: Define your keywords
It’s very important to use the right keywords in a competitive analysis. If you insist on using your own (possibly branded) company jargon as one of the main keywords, you might not have any competition, but you also won’t get any decent organic traffic to your website. An example: let’s say you’re offering ‘holiday homes’. If you insist on targeting the keyword ‘vacation cottage’ instead, you’re selling yourself short. Match the words your customers use.
Use Google Trends to get insights into how your keywords are used
Doing proper keyword research will help. Not just for this competitive analysis, but for the entire SEO optimization of your website!
Step 2: Analyze these keywords
Once you have defined the keywords you’d like to check against your competitors, the next step is obvious: search for these keywords. See who your competitors are by writing down who ranks higher than you.
Be realistic
If you’re on page two in Google and want to do a competitive analysis with the number one, there is probably a lot to gain. But you should keep two things in mind. First, your rankings probably won’t immediately shoot to the first spot. They’ll most likely go up step by step. And second, the high-ranking web pages, depending on the keywords, might have a higher marketing budget than you to back their ranking strategies. In fact, this could be the reason why they rank so high in the first place!
But don’t give up. Our mission is ‘SEO for everyone‘ for a reason. If you put in the work, you’ll be able to climb to higher rankings step-by-step. Check the keywords, then make them long-tail or add local keywords (city name, region name) if needed. Do a thorough analysis. Google Trends will tell you what keywords have more traffic in the target markets for your business, and (free/paid) tools like Ahrefs.com and Searchmetrics.com will give you even more keyword insights. You can even use the Semrush integration in Yoast SEO to find relevant related keywords that might attract traffic.
Climbing up in rankings a (few) step(s) at a time
Sometimes, you can achieve a big improvement in your rankings. But if your website is ranking number six, it’s easier to climb to spot five or four before you target the top three. Again, that top three probably has the marketing budget to go all out, whereas your immediate ‘ranking neighbors’ are struggling like you. Beat them first; it’s easier. Having said that: if you have the opportunity to dethrone number one, two, or three, of course, go ahead and do so.
Step 3: Check technical differences
You’ll need to check a number of things to determine on which aspects your competition is ahead of you. That’s why the next step of your competitive analysis is to see if there are any technical differences.
Site speed
The faster the site, the happier the visitor, and the happier the search engine. That’s why it’s important to look at speed insights when doing a competitive analysis. Speed insights will tell you if there is a huge difference between you and your main competitors.
SSL / HTTPS
Again, there are multiple ways to check SSL/HTTPS in a competitive analysis. You can get a nice overview with Builtwith.com. This site gives you a ton of technical information, including whether a site uses an SSL certificate. You can obviously check your browser’s address bar for this as well, but Builtwith can give you more insights while you go over all the other details, like what CMS your competitor uses (and whether they upgraded their WordPress install while you didn’t).
Mobile site
Mobile-first. Mobile parity. Mobile UX. It’s all about mobile these days. Which makes sense, as most of today’s website traffic is from mobile devices, a few exceptions aside.
A good mobile website is about getting your visitor to the right page as soon as possible. This has to do with speed, a clear and pleasant branded design, and deciding about the top tasks on your website. Go check the websites of your competitors and see where they clearly outperform you. Be sure to check your Core Web Vitals as well, as Google pays a lot of attention to these. To test them, you can use a tool like Google’s PageSpeed Insights, for instance.
Step 4: Compare content
Although technical optimizations are crucial, the quick wins will probably be in the field of content. Look at what you’ve written about your company and products, then see what your competitors have published on their sites.
Click all menu items
What are your competitors’ main pages? What are they trying to sell, and how did they manage to rank above you? See how focused their menu is and what pages they link to from there. Check if your competitor tells a better story than you. Then improve your story. The main menu of your website should be targeted at your visitor; it doesn’t have to explain all the awesome things you came up with.
Category pages or product pages
If you have a shop, it might be interesting to do a competitive analysis of your competitor’s store structure. Are they trying to persuade customers on a product page or on category pages?
Our advice: optimize and try to rank for most of your category pages. After all, in a market where there are a gazillion products, ranking for each and every one of them is tough! So write appealing, high-quality content for your category pages, make them your cornerstone content, and try to rank a lot of ’em. Your competitive analysis will tell you which of these pages are optimized by your main competitors. Optimize yours accordingly and, obviously, better.
A sitemap can show you the site structure of your competitor, be it via an HTML sitemap or an XML sitemap. It can tell you, for instance, if they are targeting certain long-tail keywords via the slugs of their pages. Plus, a few clicks to their pages will tell you how their internal linking is done.
You can find that sitemap on most sites at example.com/sitemap.xml or example.com/sitemap_index.xml or at example.com/sitemap. Sometimes a website doesn’t have a sitemap, but tools like Screaming Frog and Sitebulb might help you out. Crawl the site and order by URL.
Blog
Do you have a blog? If not, you probably should. A blog makes for dynamic content, and keeps your site current. And, if you post regularly, Google will find all kinds of interesting and recent ‘Last Updated’ dates.
Check if your competitor has a blog, and if theirs ranks better than yours. If so, they’ve probably woven their blog into their content strategy.
Step 5: Compare UX
Great UX makes for a better time on your site, more page views, and a lower bounce rate. We’re not getting into UX too much here, because we think you should first focus on other things in your competitive analysis. However, we wanted to highlight two things: call to action and contact pages.
Call to action
A great call to action helps any page. Whether it’s to drive sales or engagement, every page needs a proper call to action. Simply go over some of your competitor’s pages and see how they went about this. See if you can grab some ideas, and improve your own call to action. Oh, and remove that slider and/or video background. That’s not a call to action. That’s a call to no action. (If you really must include one, make sure you at least optimize your video background in the right way).
Contact page & address details
Your contact page and address details could be the end goal of a visit to your page. If so, check how the competition created that page. Did they add structured data, for instance? Is there a contact form? Did they make it easier to find these details than you did? If comparing this sparks some great ideas, then adjust your site accordingly.
Step 6: Perform a backlink analysis
Last but not least: if all seems reasonably the same, and there is no logical way to explain why your competitor outranks you, it might just be that the other website has a great deal more relevant links than you do. Or simply better ones. You’d have to check Ahrefs.com, Moz’s OpenSiteExplorer or, for instance, Searchmetrics for this.
Follow-up on your competitive analysis!
At this point, you know the main differences between your competitor’s site and yours. This is the moment where you start prioritizing optimizations and get to work. First, take care of low-hanging fruit. Fix things that are easily fixed. Next, determine what issues might have the biggest impact on your rankings, then solve these as well. If you are a regular visitor to this blog, you’ll probably have no problem with this. Our tip? Go for any speed and content issues first, and try to get more backlinks in the process.
A redirect happens when someone asks for a specific page but gets sent to a different page. Often, the site owner deleted the page and set up a redirect to send visitors and search engine crawlers to a relevant page — a much better approach than serving them annoying, user-experience-breaking 404 messages. Redirects play a big part in the lives of site owners, developers, and SEOs. So, let’s answer a couple of recurring questions about redirects for SEO.
1. Are redirects bad for SEO?
Are redirects bad for SEO? The answer is no; redirects are not inherently bad for SEO. However, it is crucial to implement them correctly to avoid potential issues. An improper implementation can lead to problems such as losing PageRank and traffic. Redirecting pages is necessary when making URL changes, as you want to preserve the hard work invested in building an audience and acquiring backlinks.
To ensure that redirects are implemented correctly and effectively, consider the following best practices:
Use the appropriate redirect type: The most commonly used redirect for permanent URL changes is the 301 redirect. This informs search engines that the original URL has permanently moved to a new location. By using a 301 redirect, you can maintain the ranking and relevance of the old URL and seamlessly redirect users and search engine crawlers to the new URL.
Update internal links: When you implement redirects, updating any internal links on your website that refer to the old URLs is important. This ensures visitors can navigate to the correct pages and search engines can properly index the new URLs.
Preserve user experience: Redirects should aim to provide a smooth user experience. Avoid excessive redirect chains, which can slow page load times and frustrate users. It’s also important to redirect users to relevant content that aligns with their intent. For example, if a page has been permanently removed, redirect users to a relevant alternative rather than a generic homepage.
Monitor and test redirects: Regularly monitor your redirects. Check for errors or issues, such as broken redirects or redirect loops. It’s also helpful to periodically test the redirects to ensure they function as expected.
2. Why should I redirect a URL?
By redirecting a changed URL, you send users and crawlers to a new URL, minimizing annoyance. Whenever you perform maintenance on your site, you take things out: you might delete a post, change your URL structure, or move your site to a new domain. You must give those old URLs a new destination, or visitors will land on 404 pages.
If you make small changes, like deleting an outdated article, you can redirect that old URL with a 301 to a relevant new article or give it a 410 to say that you deleted it. Don’t delete stuff without a plan. And don’t redirect your URLs to random articles that don’t have anything to do with the article you’re deleting. Lastly, don’t 301 redirect all your 404s to your homepage!
Bigger projects, like moving to a new domain or changing the URL paths, need a URL migration strategy. In these cases, you should look at all your site’s URLs and map them to their future locations on the new domain. After determining what goes where, you can start redirecting the URLs. Use the change of address tool in Google Search Console to notify Google of the changes.
3. What is a 301 redirect? And a 302 redirect?
A 301 redirect is a permanent redirect informing visitors and search engine crawlers that the requested URL has moved to a new destination permanently. It is the most commonly used redirect for permanent URL changes. When implementing a 301 redirect, you signal that the old URL is no longer in use and that the new URL should be accessed instead. It is important to note that with a 301 redirect, the old URL should not be used again in the future, as it signifies a permanent change.
On the other hand, a 302 redirect is a temporary redirect. This type of redirect is used to indicate that the requested content is temporarily unavailable at a specific address but will return at a later time. Unlike a 301 redirect, a 302 redirect suggests that the change is temporary and that the original URL may be used again.
You must consider the URL change’s nature when deciding which redirect to use. If the change is permanent and you have no intention of using the original URL again, a 301 redirect is appropriate. However, if the change is temporary and you plan on returning to the original URL, a 302 redirect should be used.
It is recommended to carefully consider the purpose and longevity of the URL change when selecting the appropriate redirect. If you are uncertain about which redirect you need, please read our article on which redirect to pick.
4. What’s an easy way to manage redirects in WordPress?
We might be biased, but we think the redirect manager in our Yoast SEO Premium WordPress plugin is incredibly helpful. We know that many people struggle to understand the concept of redirects and the work that goes into adding and managing them. That’s why one of the first things we wanted our WordPress SEO plugin to have was an easy-to-use redirect tool. We think we succeeded, but don’t take our word for it.
The redirect manager can help set up and manage redirects on your WordPress site. It’s an indispensable tool to keep your site fresh and healthy. We made it as easy as possible. Here’s what happens when you delete a post:
Move a post to the trash
A message pops up saying that you moved a post to the trash
Choose one of two options given by the redirects manager:
Redirect to another URL
Serve a 410 Content deleted header
If you pick redirect, a modal opens where you can enter the new URL for this particular post
5. What is a redirect checker?
A redirect checker is a tool that determines whether a certain URL is redirected and analyzes the path it follows. You can use this information to find bottlenecks, like a redirect chain in which a URL is redirected many times, making it much harder for Google to crawl that URL and giving users a less-than-stellar experience. These chains often grow without you noticing: every time you redirect a page that is itself the target of a redirect, you add another link to the chain. So, you need to keep an eye on your redirects, and a redirect checker is one of the tools to do that.
You can use one of the SEO suites, such as Sitebulb, Ahrefs or Screaming Frog to test your redirects and links. If you only need a quick check, you can also use a simpler tool like httpstatus.io to give you an insight into the life of a URL on your site. Another must-have tool is the Redirect Path extension for Chrome, made by Ayima.
6. Do I need to redirect HTTP to HTTPS?
Every site should use the HTTPS protocol, and you should be sure to redirect your HTTP traffic to HTTPS. You could get into trouble with Google if your site is available on both HTTP and HTTPS, so watch out for that. Google prefers HTTPS sites because they tend to be faster and more secure. Your visitors expect the extra security as well.
So, you need to set up a 301 redirect from HTTP to HTTPS. There are a couple of ways of doing this, and you should plan it to make sure everything goes as it should. The preferred way is to do this at the server level: find out what kind of server your site is running (NGINX, Apache, or something else) and what code you need to add to your server config file or .htaccess file. Your host will often have a guide to help you set up the HTTP-to-HTTPS redirect at the server level, and some hosts even have a simple setting that manages this in one go.
There are also WordPress plugins that can handle the HTTPS/SSL setup for your site, but for this specific issue we wouldn’t rely on a plugin; manage the redirect at the server level instead. Don’t forget to let Google know of the changes in Search Console.
Redirects for SEO
There are loads of questions about redirects to answer. The redirect concept isn’t too hard to grasp if you think about it. Getting started with redirects isn’t that hard, either. The hard part of working with redirects is managing them. Where are all these redirects leading? What if something breaks? Can you find redirect chains or redirect loops? Can you shorten the paths? You can gain a lot from optimizing your redirects, so dive in and fix them. Do you have burning questions about redirects? Let us know in the comments!
Master the art of using Schema.org to elevate your online visibility with our ultimate guide to structured data. Dive into the heart of Schema.org and how it can revolutionize how your site interacts with search engines like Google. Explore its power in improving the presentation of your pages when describing products, reviews, events, and recipes. Discover how to get rich results such as snippets, interactive mobile results, voice-activated actions, or securing a spot in Google’s coveted Knowledge Graph. Embrace structured data — your ticket to better online exposure and interaction.
Structured data is a way of describing your website to make it easier for search engines to understand. You need a so-called vocabulary to make it work, and the one used by the big search engines is Schema.org. Schema.org provides a series of tags and properties to describe your products, reviews, local business listings, job postings, et cetera in detail.
The major search engines, Google, Bing, Yandex, and Yahoo, developed this vocabulary to reach a shared language to understand websites better. Search engines use it today for many things, from fact-checking content to listing job postings!
If added well, search engines can use the applied structured data to better understand your page’s contents. As a result, your site might be presented better in search results, for example, in the form of rich results like rich snippets. However, there are no guarantees you’ll get rich results — that’s up to the search engines.
Below is an example of simple structured data using Schema.org in JSON-LD format: a basic schema for a product with review properties. The code tells search engines that the page is about a product (Product); it provides the name and description of the product, pricing information, the URL, plus product ratings and reviews. This helps search engines understand your product and present your content in the search results.
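A minimal sketch of what that markup can look like; the product name, URLs, prices, and rating values below are made-up placeholders that you would replace with your own data:
<!-- Example Product markup with placeholder values -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Holiday Home",
  "description": "A cozy two-bedroom holiday home near the coast.",
  "url": "https://example.com/holiday-home/",
  "image": "https://example.com/images/holiday-home.jpg",
  "offers": {
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.7",
    "reviewCount": "89"
  },
  "review": {
    "@type": "Review",
    "reviewRating": { "@type": "Rating", "ratingValue": "5" },
    "author": { "@type": "Person", "name": "Jane Doe" },
    "reviewBody": "We had a wonderful stay."
  }
}
</script>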
Structured data, particularly when using the Schema.org vocabulary, breathes life into your site for search engines. It describes your products, reviews, events, job postings, and more in a language that search engines instantly understand. The beauty of structured data lies in its precision and detailed presentation of your site’s content. Gone are the days when search engines had to make guesses about your content: with structured data, every site element is deciphered clearly.
Structured data is crucial because it can outline clear connections among diverse website components. It fosters a new understanding for search engines, helping them see your site’s content and how everything relates. It’s a roadmap of your site’s content, with each piece connected and important to the bigger picture.
In a world where clarity equals visibility, structured data is no longer nice but necessary. By applying structured data, you speak the language of search engines, augmenting your website’s comprehensibility and attracting more organic traffic.
Is structured data important for SEO?
Implementing structured data using Schema.org is a strategic move in bolstering your website’s SEO. While it may not directly improve your site’s rankings, it enriches search result listings, making your site a more appealing click to prospective visitors.
Envision your search result as a movie trailer: The preview captivates the audience and compels them to watch it. An enhanced search result crafted with structured data offers a similar advantage. It gives searchers a more detailed, enriched preview of your website, significantly increasing the likelihood of being chosen from a sea of links. If your website delivers on what the enhanced listing promises, congratulations – you’ve just become a reputable source for your visitor. This user satisfaction translates into a lower bounce rate, signaling to search engines like Google that your site is a credible and reliable resource.
Moreover, with structured data gradually gaining traction, now is the perfect opportunity to leapfrog your competitors. It’s not just about keeping pace in the SEO race; it’s about being a frontrunner. Our structured data guide is designed to equip you with pragmatic tips and recommendations to maximize your website’s potential using structured data.
Structured data can lead to rich results
By describing your site for search engines, you allow them to do exciting things with your content. Schema.org and its support are constantly developing, improving, and expanding. As structured data forms the basis for many new developments in the SEO world, like voice search, there are bound to be more soon. Below is an overview of the rich results currently available; you can find examples in Google’s Search Gallery:
Article
FAQ
Q&A
Book
Home activities
Recipe
Breadcrumbs
How-to
Review snippet
Carousel
Image metadata
Sitelinks searchbox
Course
Job posting
Software app
Critic review
Learning video
Speakable
Dataset
Local business
Subscription and paywalled content
Education Q&A
Logo
Video
Employer aggregate rating
Math solver
Estimated salary
Movie
Event
Practice problem
Fact check
Product
The rich results formerly known as rich snippets
Rich results are your golden ticket to creating dynamic, engaging, and information-packed search result listings. They are much more than the standard, black-line meta description text on a search engine results page. Harnessing the power of rich results enriches search listings with additional information and interactive functionalities.
Some listings offer extra information, like star ratings or product details
Consider rich results as added-value details that elevate a user’s search experience. From showing critical product data such as pricing and reviews to practical navigational tools like breadcrumbs or in-site search functions, rich results make your listing stand out in a competitive digital landscape.
Where a conventional search result offers a glimpse into your site, a rich result is akin to rolling out a red carpet, enticing users with a premium overview of what they can expect when they click through. These enriched listings can effectively boost click-through rates (CTR) and enhance user interaction, providing an SEO advantage that improves your site’s visibility and drives more traffic.
In today’s mobile-driven world, rich results are pivotal in shaping a distinctive and interactive search experience. Rich results find a particular resonance in mobile searches, becoming more prevalent and impactful. Specific searches for local restaurants, recipes, movies, how-tos, and courses benefit from a specialized treatment in mobile search results.
Tasty, right?
They are often presented in a touch-friendly, swipeable manner, making rich results intuitive and streamlined on mobile devices. This format, frequently called the carousel, significantly enhances the user experience and the ease of access to information.
Google significantly emphasizes fostering rich, interactive elements within these results. With Google’s touch of innovation, you can conveniently reserve a table at your favorite restaurant, order movie tickets, find delectable cheesecake recipes, or even book flight tickets, all directly from the search results. Google’s advancements have turned the humble search engine results page into a powerhouse encompassing almost every aspect of daily life, with structured data propelling parts of it.
With structured data and rich results, your website gains the potential to offer more than just links and text — it becomes more visible to the user. It elevates the user experience by leaps and bounds, improving your site’s visibility and creating potential for greater user engagement. Remember, the precision and dynamism offered by structured data and rich results are still in the early adoption phases across the web. Therefore, harnessing them presents a lucrative opportunity to gain a competitive edge. And from the looks of it, we’re just beginning to scratch the surface of potential here.
Knowledge Graph Panel
When you search Google, you’ll commonly see a large box of detailed information on the right side. This dynamic feature, Google’s Knowledge Graph Panel, provides an enriched snapshot of information tied to your specific search.
A knowledge panel
So, how does Google amass this information? It systematically evaluates related content about the subject in question, with structured data from a website being a significant resource. This exhaustive examination of interconnected data helps to unveil a more holistic picture of the search subject.
Imagine you’re a verified business or an authority on a particular subject. The Knowledge Graph panel can showcase your name, logo, and social media profiles. This visibility conveys a sense of prestige and credibility, since it is Google itself that puts it there.
But the implications of the Knowledge Graph Panel go beyond just surface-level information. Linking to a multitude of related content creates a comprehensive web of knowledge that helps users delve deeper into their areas of interest. This enhances user experience and increases the time spent on Google services, making it a win-win feature for both ends.
Moreover, the Knowledge Graph’s influence extends to SEO strategy. Think of the panel as a billboard in the search results, showcasing your site’s relevance and authority. It underscores the importance of structured data in shaping a website’s digital visibility and shows how optimized, high-quality content can pave the way for enriched search results.
Featured snippets
This might be a sneaky addition because featured snippets are rich results, but they do not get their content from structured data. A featured snippet answers a search question directly in the search results but uses regular content from a web page to do so.
A featured snippet for the search term [site structure]
Does structured data work on mobile?
Yes, the results of implementing structured data work everywhere. Mobile is one of the places where the results of a Schema implementation are most visible.
If a page meets the criteria Google sets, you can now book movie tickets or reserve a table at a restaurant directly from the search results. If you implement structured data correctly, you could also be eligible for several interactive extras on the mobile search results pages.
Different kinds of structured data
Sitelinks Searchbox
A Searchbox is where the internal search engine of a site is presented within the search results of Google. Google uses Schema.org code for this as well. Yoast SEO has support for this built in, and there’s more info in our Knowledge Base.
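Under the hood, a sitelinks searchbox is WebSite markup with a SearchAction as its potentialAction. A minimal sketch, assuming a WordPress site where ?s= is the search parameter and example.com stands in for your domain (Yoast SEO outputs this kind of markup for you automatically):
<!-- Sitelinks searchbox markup; domain and search URL pattern are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://example.com/?s={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>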
If you look at the Schema.org website, you’ll notice a lot of information you could add to your site as structured data. Not everything is relevant, though. Before implementing structured data, you need to know what you should mark up. Do you have a product in an online store? Do you own a restaurant? Or do you have a local business providing services to the community? Or a site with your favorite cheesecake recipes? Whichever it is, you need to know what you want to achieve and explore the possibilities. Don’t forget to check the documentation by search engines to understand what they need from you.
Yoast SEO does a lot of these
Yoast SEO has Schema controls, which help translate your content into a language search engines understand and appreciate. It automatically generates structured data for your site with sensible default settings, which you can also manually adjust based on over twenty supported content types. This granular control over your Schema settings can increase your chance of obtaining coveted rich results.
For instance, our Schema tab lets you specify your contact page as a ContactPage, removing potential ambiguities for search engines. Beyond this, Yoast SEO makes automatic connections that guide search engines in deciphering the meaning of your site. We also provide additional features that enhance Schema elements and center around content specificity, all contributing to a cohesive structured data strategy.
Yoast SEO helps you fine-tune your schema structured data settings per page
Creative works
The Creative Work group encompasses all creative things produced by someone or something. You’ll find the most common ones below, but the list is much longer. You’ll also find properties for sculptures, games, conversations, software applications, visual artworks, and much more. However, most of these properties don’t have a rich presentation in search engine results, so they are less valuable. But, as mentioned earlier, if your site has items in the categories below, mark them up with Schema.org.
Articles
An article could be a news item or part of an investigative report. You can distinguish between a news article, a tech article, or even a blog post.
Books
A book is a book, whether in paper form or digital form, as an eBook. You can mark up every property type, from the author who wrote it to any awards it has won.
Datasets
Google understands structured data for datasets and can use it to surface and understand these datasets better. Find out more on Google’s developer pages.
FAQ pages
Make a great FAQ page to answer your customers’ frequently asked questions. Yoast SEO helps you turn those FAQs into something Google can understand properly, with the Schema structured data FAQ content block.
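For reference, the FAQ block boils down to FAQPage markup along these lines; the questions and answers below are made-up placeholders:
<!-- FAQPage markup with placeholder questions and answers -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Do you ship internationally?",
      "acceptedAnswer": { "@type": "Answer", "text": "Yes, we ship to most countries within 5 to 7 business days." }
    },
    {
      "@type": "Question",
      "name": "Can I return a product?",
      "acceptedAnswer": { "@type": "Answer", "text": "You can return any product within 30 days of purchase." }
    }
  ]
}
</script>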
How-tos
You can mark up your how-to articles with HowTo structured data; a how-to walks users through a step-by-step process to get a specific task done. Our structured data content block for how-tos is available in the WordPress block editor.
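A minimal HowTo sketch with placeholder steps, just to show the shape of the markup:
<!-- HowTo markup with placeholder steps -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to descale a coffee machine",
  "totalTime": "PT30M",
  "step": [
    { "@type": "HowToStep", "name": "Empty the machine", "text": "Remove the water tank and empty any remaining water." },
    { "@type": "HowToStep", "name": "Add descaler", "text": "Fill the tank with water and descaling solution." },
    { "@type": "HowToStep", "name": "Run a cycle", "text": "Run a full brewing cycle, then rinse with clean water." }
  ]
}
</script>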
Image metadata
By incorporating detailed image metadata, Google Images can provide a more comprehensive understanding of the image, including details about the creator, usage permissions, and acknowledgment specifics.
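A sketch of what such image metadata can look like in JSON-LD; every URL and name below is a placeholder, and Google’s image license documentation lists the exact properties it supports:
<!-- ImageObject markup with placeholder license and creator details -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ImageObject",
  "contentUrl": "https://example.com/images/cheesecake.jpg",
  "license": "https://example.com/image-license/",
  "acquireLicensePage": "https://example.com/buy-this-image/",
  "creator": { "@type": "Person", "name": "Jane Doe" },
  "creditText": "Jane Doe Photography",
  "copyrightNotice": "© Jane Doe"
}
</script>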
Music
Music also gets the structured data treatment. There are a couple of Schema.org types of interest for music, like MusicRecording, MusicAlbum, MusicEvent, and MusicGroup.
Q&A pages
Question and answer pages are eligible for rich results as well. According to Google, Q&A pages differ from FAQ pages, where you can find multiple questions and answers on a page — more in Google’s Q&A page documentation. Use the Yoast SEO structured data content blocks to provide structured data for your FAQ pages.
Recipes
Adding Recipe structured data to the recipes on your cooking website lets you get your recipes featured directly in search results. Moreover, recipes look great in mobile rich results, with big images (if you add them). And that’s not all: you can now send your recipes to Google Home and have Assistant read them out loud. How cool is that?
Speakable
Google is currently testing the implementation of speakable Schema.org. With this code, you can tell a search engine that a piece of content is specially written to be spoken aloud by digital assistants like Alexa, Siri, Cortana, or Google Assistant.
TV & Movies
Movies and TV shows get their piece of structured data as well. Searching for a movie in search engines will yield a rich result with reviews, poster art, cast information, and even the ability to order tickets for a showing directly. You can even mark up lists of the best movies ever made or your favorite TV shows.
Videos
It’s possible to do all kinds of interesting things with video. Google, in particular, is working on new ways to display videos in the search results.
Commerce
There is also structured data for commercial goals. Here, you’ll find a couple of important ones:
Events
Marking up your event listings with the correct Event markup from Schema.org might lead to search engines showing your events directly in the search results. This is a must-have if you own a nightclub or a venue, or run any business that regularly organizes events.
Businesses and organizations
If you make money with your website, you might own a business. If you’re a site owner or work on a company site, you’ll find the business and organization Schema.org types interesting. Almost every site can benefit from the correct business markup. If you do it well, you could get a nice Knowledge Graph panel or another type of rich listing in the search engines. You can even add specific structured data for your contact details so customers can contact you directly from the search results. For local businesses, our Local SEO plugin helps you take care of all your local structured data needs.
Job postings
Have job postings on your site? Mark them up with the JobPosting structured data to have them show up nicely in Google. We use this on all our Yoast job postings as well.
Products
Schema.org markup for products is almost as important as the markup for businesses and organizations. Using the Product schema, you can give your products the extra data search engines need to show rich snippets, for example. Think of all the search results with added information, like pricing, reviews, and availability. This should be a significant part of your structured data strategy if you have products. Remember to mark up your product images as well. Our WooCommerce SEO and Shopify SEO products output proper product structured data.
Reviews and ratings
Reviews and ratings play an important role in today’s search process. Businesses, service providers, and online stores all use reviews to attract more customers and show how trustworthy their offer is. Getting those five stars in the search results might be the missing link to building a successful business.
Voice assistants are interesting, but they have yet to reach the adoption levels they were predicted to have. Still, there’s stuff happening on this front if you see it as part of the conversational search movement. Take recipes, for instance; you can send a recipe from the search results to your Google Home to read aloud while cooking. These are called Actions, and there are a whole bunch of them. If you want your recipes to appear in the Google Assistant library, add a specific structured data set and adhere to additional rules. You can find more on that on the Creating a recipe action page. Visit Google’s Assistant site to get a feel for what’s possible (a lot!).
Google Assistant uses a lot of structured data to understand your content
The technical details
To start marking up your pages, you must understand how Schema.org works. If you look closely at the full specs on Schema.org, you’ll see a strict hierarchy in the vocabulary. Everything is connected, just like everything is connected on your pages. Scroll through the list to see all the options, and note the ones you need.
Let’s look at the structure. A Schema.org implementation starts with a Thing, the most generic type of item. A Thing could be a more specific type of item, for instance, a Creative Work, an Event, Organization, Person, Place, or Product.
For example, a movie is a “Thing” and a “Creative Work”, which falls under the more specific type “Movie”. You can add a lot of properties to it, like a “Description”, a “Director”, an “Actor”, a poster “Image”, a “Duration”, or a “Genre”. There is plenty more to add, so you can get as specific as you want. However, don’t overdo it, since not all search engines use every property, at least not yet. For instance, you should look at the specifications in Google’s documentation to see which properties are required and which are recommended.
A sample Schema.org structure
If we put what we know now into a hierarchy, this is what you end up with (a minimal JSON-LD sketch follows the outline):
Thing
Creative Work
Movie
Description (type: text)
Director (type: person)
Actor (type: person)
Image (type: ImageObject or URL)
etc.
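Translated into JSON-LD, that hierarchy could look roughly like this; the title, names, and URLs are placeholders:
<!-- Movie markup with placeholder values -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Movie",
  "name": "Example Movie",
  "description": "A placeholder description of the movie.",
  "director": { "@type": "Person", "name": "Jane Doe" },
  "actor": [ { "@type": "Person", "name": "John Roe" } ],
  "image": "https://example.com/poster.jpg",
  "duration": "PT2H10M",
  "genre": "Drama"
}
</script>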
If it were a local business, you could use something like this:
Thing
Organization (or Place)
LocalBusiness
Dentist
Name
Address
Email
Logo
Review
etc.
For local businesses, you can pick a more specific type of business, which makes it easier for search engines to determine what kind of business you own. There are hundreds of types of local businesses; if your business doesn’t fit any of the descriptions, you can use the Product Types Ontology to get more specific than the standard listing allows.
In the local business example, you’ll see that Google lists several required properties, like your business’s NAP (Name, Address, Phone) details. There are also recommended properties, like URLs, geo-coordinates, opening hours, etc. Try to fill out as many of these as possible, because only then will search engines give you the full presentation you’re after. You’ll find our Local SEO plugin very helpful if you need help with your local business markup.
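Put together, a sketch for a dentist’s practice could look like the markup below. Every detail is a placeholder, and our Local SEO plugin generates comparable markup for you:
<!-- LocalBusiness (Dentist) markup with placeholder NAP details, coordinates, and opening hours -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dentist",
  "name": "Example Dental Practice",
  "url": "https://example.com/",
  "telephone": "+31-10-123-4567",
  "image": "https://example.com/images/practice.jpg",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "Example Street 1",
    "addressLocality": "Rotterdam",
    "postalCode": "3011 AA",
    "addressCountry": "NL"
  },
  "geo": { "@type": "GeoCoordinates", "latitude": "51.9225", "longitude": "4.4792" },
  "openingHours": "Mo-Fr 09:00-17:00"
}
</script>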
What do you need to describe for search engines?
Looking at Schema.org for the first time is daunting. The list is enormous, and the possibilities are endless, so it’s easy to become overwhelmed. To get over this, go back to basics: think about your site, business, or product, write down the specifications and properties you feel are necessary, then work your way up from there. Also, Yoast SEO covers the most essential properties automatically, so there is no need to worry about those if your plugin is configured correctly!
Having said that, there are a couple of sections you should prioritize in your plan to add structured data to your site. If you start with these three, you’ll have the basics covered, and then you can build on that. You should start with structured data for your business details, products, and reviews. These will have the biggest effect in the short term.
How to implement structured data
Don’t be frightened, but here comes the technical part of the story. Before we get to that, we’d like to remind you that Yoast SEO comes with an excellent structured data implementation. It automatically handles the most pressing structured data needs of most sites. Of course, as mentioned below, you can extend our structured data framework as your needs grow.
Do the Yoast SEO configuration and get your site’s structured data set up in a few clicks! The configuration is available for all Yoast SEO users to help you get your plugin configured correctly. It’s quick, it’s easy, and doing it will pay off. Plus, if you’re using the block editor in WordPress, you can also add structured data to your FAQ pages and how-to articles using our structured data content blocks.
Thanks to JSON-LD, there’s nothing scary about adding the data to your pages anymore. This JavaScript-based data format makes it much easier to add structured data, since it forms a separate block of code and is no longer embedded in the HTML of your page. This makes it easier to write and maintain, and both humans and machines understand it better. If you need help implementing JSON-LD structured data, you can enroll in our free Structured data for beginners course or our Understanding structured data course, or follow a high-level course on Google’s Codelabs.
Structured data with JSON-LD
JSON-LD is the preferred and most efficient method of adding structured data to your site. Not all search engines were quick to adopt it, though, with Bing being the last holdout. Thankfully, Microsoft came around in August 2018, and Bing now supports it as well.
Since Yoast SEO 11.0, the plugin comes with a fully-featured Schema.org implementation. Yoast SEO now creates a structured data graph for every page on your site, interconnecting everything. While working on this, we’ve also created complete, detailed documentation on Schema, including a specification for integrating structured data. You’ll find some example graphs for various standard pages on your site.
The old ways: RDFa and Microdata
The classic way of writing structured data involves embedding it directly into the HTML of your pages. This made for a really inefficient and error-prone process, and it is much of the reason why the uptake of Schema.org hasn’t been swift. Writing and maintaining it via RDFa or Microdata is a pain. Believe us: do as much as you can in JSON-LD.
Microdata needs itemprops to function, so everything has to be coded inline. You can instantly see how that makes it hard to read, write, and edit.
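For comparison, here is a tiny Microdata snippet for a product; notice how every property has to be woven into the HTML itself (the values are placeholders):
<!-- Microdata: properties are declared inline with itemscope/itemprop attributes -->
<div itemscope itemtype="https://schema.org/Product">
  <h1 itemprop="name">Example Holiday Home</h1>
  <p itemprop="description">A cozy two-bedroom holiday home near the coast.</p>
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <span itemprop="priceCurrency" content="EUR">€</span>
    <span itemprop="price" content="129.00">129.00</span>
  </div>
</div>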
Tools for working with structured data
Yoast SEO automatically handles much of the structured data in the background. You can extend our Schema framework, of course (see the next chapter), but if adding code by hand seems scary, you could try some of the tools listed below. And if you’re not sure how to proceed, ask your web developer for help; they can usually fix this for you in a couple of minutes.
Yoast SEO uses JSON-LD to add Schema.org information about your site search, your site name, your logo, images, articles, social profiles, and a lot more to your web pages. We ask whether your site represents a person or an organization and adapt our structured data based on that. Also, our structured data content blocks for the WordPress block editor make it easy to add structured data to your FAQs and how-tos. Check out the structured data features in Yoast SEO.
The Yoast SEO Schema structured data framework
Implementing structured data has always been challenging, and the results of many implementations left room for improvement. At Yoast, we set out to enhance the Schema output of millions of sites. For this, we built a Schema framework, ready to be adapted and extended by anyone. We combined all those loose bits and pieces of structured data that appear on many sites, improved them, and put them in a graph. By interconnecting all these bits, we offer search engines all your connections on a silver platter.
See this video for more background on the schema graph.
Of course, there’s a lot more to it. You can also extend the Yoast SEO output by adding specific Schema pieces, like how-tos or FAQs; we built structured data content blocks for these in the WordPress block editor. We’ve also enabled other WordPress plugins to integrate with our structured data framework, like Easy Digital Downloads, The Events Calendar, Seriously Simple Podcasting, and WP Recipe Maker, with more to come. Together, these integrations help you remove barriers for search engines and users, as working with structured data has always been challenging.
Expand and improve your structured data implementation
You’ll need to follow a structured and focused approach to effectively implement and enhance Schema.org markup on your website. This includes understanding the key concepts, identifying your goals, leveraging the right tools, and regularly reviewing your strategy. Here’s a guide on how you can do this:
Understanding Schema.org Markup: First, you need to understand what Schema.org markup is and why it’s crucial for your SEO strategy. Yoast’s developer portal provides a detailed insight into the functional approach of constructing Schema.org markup. This will help you to comprehend its importance in generating rich results on search engines.
Selecting the right format: Choosing the right format for your structured data is critically important. Yoast recommends JSON-LD as the preferred format for structured markup. As explained on Yoast’s Technology and Approach page, JSON-LD provides usability and efficiency in conveying structured information, making it a format recommended by the major search engines, including Google.
Integrating with Yoast’s structured data framework: To seamlessly add Schema.org markup to your web pages, you can use our structured data framework. Yoast’s Schema Integration Guidelines provide an easy and beneficial way to integrate Schema.org markup, optimize communication with search engines, and potentially improve your site’s SEO performance.
Reviewing and enhancing your implementation: To keep your structured data markup implementation effective, reviewing and enhancing it regularly is advisable. Not only does this help in identifying any potential issues, but it also presents opportunities to improve your existing markup for better SEO performance.
Read up
By following the guidelines and adopting a comprehensive approach, you can successfully get structured data on your pages and enhance the effectiveness of your schema.org markup implementation for a robust SEO performance. Read the Yoast SEO Schema documentation to learn how Yoast SEO works with structured data, how you can extend it via an API, and how you can integrate it into your work.
Several WordPress plugins already integrate their structured data into the Yoast SEO graph
Most search engines have their own developer center where you can find more information on the inner workings of their structured data implementations. Read these to see what works and what doesn’t, and stick to their rules, because a bad Schema.org implementation could lead to a penalty. Always check your code in a structured data testing tool, such as Google’s Rich Results Test, to see if it’s correct. Fix errors, and regularly check the code on your site to see if it’s still up to scratch.
You can’t run away from structured data anymore. If your site means anything to you, you should look into it and figure out the best way to use Schema.org. Still need some help? Read more on how Yoast SEO makes it easy for you. Or check out our digital story on rich results, structured data and Schema, or take our free structured data for beginner’s online training course. If implemented correctly, it can do great things for your site, now and in the future. Search engines are constantly developing new ways to present search results, and more often than not, they use Schema.org data.
Adding structured data to your site is an essential part of technical SEO. If you’re wondering how fit your site’s overall technical SEO is, take our technical SEO fitness quiz to find out. This quiz helps you figure out what you can still work on!
If you use Yoast SEO on your site, you’re probably familiar with features like the SEO analysis or the snippet preview. You might also know our inclusive language analysis, and how easily you can link to related posts or create redirects in the premium version of the plugin. But there’s (much) more! For instance, the Yoast SEO plugin has so-called hidden features. You won’t find them in your settings, but they do great work. Today, we’ll dive into these hidden features: which ones do we have and how do they lighten your load?
Why hidden features?
You can optimize a website in many different ways. Imagine having a toggle for all these options! That’s why, when developing our Yoast SEO plugin, we decided not to translate all these options into settings. If we believe something is beneficial for every Yoast SEO user, we turn the feature on. We call these features hidden features because as a user you’re not necessarily aware of their existence. You might even think we don’t have certain features because there’s no setting for it. But the opposite is true! We’re quietly taking care of things for you.
The hidden features of Yoast SEO
To help you understand what Yoast SEO does for your website in the background, we’ve listed some of the hidden features for you below. Let’s go through them one by one!
1. A structured data graph
Yoast SEO outputs a fully-integrated structured data graph for your posts and pages. But what is a structured data graph? And how does it help you optimize your site? To answer these questions, you first need to know what Schema is.
A few years ago, search engines came up with something called Schema.org to better understand the content they crawl. Schema is a bit like a glossary of terms for search engine robots. This structured data markup will help them understand whether something is a blog post, a local shop, a product, an organization or a book, just to name a few possibilities. Or, whether someone is an author, an actor, associated with a certain organization, alive or even a fictional character, for instance.
For all these items there’s a set of properties that specifically belongs to that item. If you provide information about these items in a structured way – with structured data – search engines can make sense of your site and the things you talk about. As a reward, they might even give you those eye-catching rich results.
How does the Yoast SEO plugin help?
Adding structured data to your site’s content is a smart thing to do. But as the number of structured data items grows, all these loose pieces of code can end up as one big, unorganized pile of Schema markup on your site’s pages. Yoast SEO helps you prevent that. For every page or post, our plugin creates a neat structured data graph in which it connects the loose pieces of structured data to each other. When the pieces are connected, a search engine can understand, for instance, that a post is written by author X, who works for organization Y, which sells brand Z.
You can even build full how-to articles and FAQ pages using the free structured data content blocks in Yoast SEO!
A structured data graph: Yoast SEO connects blobs of Schema markup in one single graph, so search engines understand the bigger picture.
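Here is a stripped-down sketch of such a graph. Yoast SEO’s actual output contains more pieces and properties, but the principle is the same: every piece gets an @id, and other pieces point to that @id instead of repeating the data (all names and URLs below are placeholders):
<!-- A minimal @graph: the Article references its author and publisher by @id -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Company",
      "logo": "https://example.com/logo.png"
    },
    {
      "@type": "Person",
      "@id": "https://example.com/#/schema/person/jane",
      "name": "Jane Doe",
      "worksFor": { "@id": "https://example.com/#organization" }
    },
    {
      "@type": "Article",
      "@id": "https://example.com/example-post/#article",
      "headline": "An example post",
      "author": { "@id": "https://example.com/#/schema/person/jane" },
      "publisher": { "@id": "https://example.com/#organization" }
    }
  ]
}
</script>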
2. Canonical URLs
Canonicals were introduced quite some time ago as an answer to duplicate content. So, what’s duplicate content? Duplicate content means you’ve published content that is the same as, or very similar to, other content on your site. In other words, it’s available on multiple URLs. This confuses search engines: they start to wonder which URL they should show in the search results.
Duplicate content can exist without you being aware of it. In an online store, for instance, one product might belong to more than one category. If the category is included in the URL, the product page can be found on multiple URLs. Another example would be campaign tags. If you add these tags to your URLs when you share content on social or in your newsletter, it means the same page is available on a URL with and without a campaign tag. And there are more technical causes for duplicate content such as these.
The solution for this type of duplicate content issue is a self-referencing canonical. A canonical URL lets you say to search engines: “Of all the options available for this content, this URL is the one you should show in the search results”. You do this by adding a rel=canonical tag to a page, pointing at the page you’d like to rank; in this case, the canonical tag should point to the URL of the original page.
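In the HTML head of the page, that boils down to a single line (the URL is a placeholder):
<!-- Canonical tag pointing at the preferred version of the page -->
<link rel="canonical" href="https://example.com/original-product-page/" />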
How does the Yoast SEO plugin help?
Should you go through all your posts now and add the canonical tag? Not if you’re using Yoast SEO. The plugin does this for you, everywhere on your site: single posts and pages, homepages, category archives, tag archives, date archives, author archives, etc. If you’re not really a techy person, the canonical isn’t easy to wrap your head around. Or perhaps you simply don’t have the time to focus on it. Why not let Yoast SEO take care of it? Then you can move on to the more exciting stuff!
3. Rel=next / rel=prev on archives
Another hidden feature in Yoast SEO is rel=next / rel=prev. It’s a method of telling search engines that certain pages belong to an archive: a so-called paginated archive. A rel=next / rel=prev tag in the header of your site lets search engines know what the previous and next page in that archive is. No one other than search engines and people looking at your site’s source code sees this piece of code.
Not so long ago, Google announced that it isn’t using rel=next/prev anymore. Does this mean we should do away with this feature? Certainly not! Bing and other search engines still use it, so Yoast SEO will keep on adding rel=next / prev tags to paginated archives.
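In practice, these are just two link elements in the head of an archive page; on page two of a blog archive, for instance, they could look like this (the URLs are placeholders):
<!-- Pagination hints for a paginated archive -->
<link rel="prev" href="https://example.com/blog/" />
<link rel="next" href="https://example.com/blog/page/3/" />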
4. Nofollow on login and registration links
If you have a WordPress site, you most likely have a login link and a registration link for the backend of your site. But the login and registration pages of your backend are places that visitors and search engines never need to be.
Therefore, Yoast SEO tells search engines not to follow links to the login and registration pages. It’s a tiny tweak, but it saves a lot of unneeded crawling by Google.
5. Noindex your internal search results
This hidden feature is based on Google’s Search Essentials documentation. Google wants to prevent users from going from a search result in Google to a search result page on a website. Google, justly, considers that bad user experience.
You can tell search engines not to include a certain page in their search results by adding a noindex tag to that page. Because of Google’s guidelines, Yoast SEO adds a noindex tag to your internal search results pages, telling search engines not to display them in their search results. The links on these pages can still be followed and counted, which is better for your SEO.
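In its simplest form, such a noindex tag is a robots meta tag in the head of the page, along these lines:
<!-- Keep the page out of the index, but still follow its links -->
<meta name="robots" content="noindex, follow" />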
6. Removing ?replytocom variables
This last hidden feature is quite a technical one. In short, it prevents your site from creating lots of URLs with no added value. WordPress has a replytocom feature that lets you reply to comments without activating JavaScript in your browser. But this means that for every comment, WordPress creates a separate URL with a ?replytocom variable.
So what happens if you get a lot of comments? Search engines then have to crawl all those URLs, which is a waste of your crawl budget. That’s why we remove these variables by default.
But that’s not all…
Our plugin comes with loads of features and settings that benefit the online visibility of your website. The free version of Yoast SEO already gives you access to many features that help you do well in the search results. Yoast SEO Premium gives you access to additional tools, like the internal linking suggestions and the redirect manager. This makes many SEO-related tasks much easier and saves you time.
Buy Yoast SEO Premium now!
Unlock powerful features and much more for your WordPress site with the Yoast SEO Premium plugin!
Bots have become an integral part of the digital space today. They help us order groceries, play music on our Slack channel, and pay our colleagues back for the delicious smoothies they bought us. Bots also populate the internet to carry out the functions they’re designed for. But what does this mean for website owners? And (perhaps more importantly) what does this mean for the environment? Read on to find out what you need to know about bot traffic and why you should care about it!
Let’s start with the basics: A bot is a software application designed to perform automated tasks over the internet. Bots can imitate or even replace the behavior of a real user. They’re very good at executing repetitive and mundane tasks. They’re also swift and efficient, which makes them a perfect choice if you need to do something on a large scale.
What is bot traffic?
Bot traffic refers to any non-human traffic to a website or app, which is a very normal thing on the internet. If you own a website, it’s very likely that it has been visited by a bot. As a matter of fact, bot traffic accounts for almost 30% of all internet traffic at the moment.
Is bot traffic bad?
You’ve probably heard that bot traffic is bad for your site. And in many cases, that’s true. But there are good and legitimate bots too. It depends on the purpose of the bots and the intention of their creators. Some bots are essential for operating digital services like search engines or personal assistants. However, some bots want to brute-force their way into your website and steal sensitive information. So, which bots are ‘good’ and which ones are ‘bad’? Let’s dive a bit deeper into this topic.
The ‘good’ bots
‘Good’ bots perform tasks that do not cause harm to your website or server. They announce themselves and let you know what they do on your website. The most popular ‘good’ bots are search engine crawlers. Without crawlers visiting your website to discover content, search engines have no way to serve you information when you’re searching for something. So when we talk about ‘good’ bot traffic, we’re talking about these bots.
Other than search engine crawlers, some other good internet bots include:
SEO crawlers: If you’re in the SEO space, you’ve probably used tools like Semrush or Ahrefs to do keyword research or gain insight into competitors. For those tools to serve you information, they also need to send out bots to crawl the web and gather data.
Commercial bots: Commercial companies send these bots to crawl the web to gather information. For instance, research companies use them to monitor news on the market; ad networks need them to monitor and optimize display ads; ‘coupon’ websites gather discount codes and sales programs to serve users on their websites.
Site-monitoring bots: They help you monitor your website’s uptime and other metrics. They periodically check and report data, such as your server status and uptime duration. This allows you to take action when something’s wrong with your site.
Feed/aggregator bots: They collect and combine newsworthy content to deliver to your site visitors or email subscribers.
The ‘bad’ bots
‘Bad’ bots are created with malicious intentions in mind. You’ve probably seen spam bots that spam your website with nonsense comments, irrelevant backlinks, and atrocious advertisements. And maybe you’ve also heard of bots that take people’s spots in online raffles, or bots that buy out the good seats in concerts.
It’s due to these malicious bots that bot traffic gets a bad reputation, and rightly so. Unfortunately, a significant amount of bad bots populate the internet nowadays.
Here are some bots you don’t want on your site:
Email scrapers: They harvest email addresses and send malicious emails to those contacts.
Comment spam bots: Spam your website with comments and links that redirect people to a malicious website. In many cases, they spam your website to advertise or to try to get backlinks to their sites.
Scraper bots: These bots come to your website and download everything they can find. That can include your text, images, HTML files, and even videos. Bot operators will then re-use your content without permission.
Bots for credential stuffing or brute force attacks: These bots will try to gain access to your website to steal sensitive information. They do this by trying to log in like a real user.
Botnet, zombie computers: They are networks of infected devices used to perform DDoS attacks. DDoS stands for distributed denial-of-service. During a DDoS attack, the attacker uses such a network of devices to flood a website with bot traffic. This overwhelms your web server with requests, resulting in a slow or unusable website.
Inventory and ticket bots: They go to websites to buy up tickets for entertainment events or to bulk-purchase newly released products. Brokers use them to resell tickets or products at a higher price to make a profit.
Why you should care about bot traffic
Now that you’ve got some knowledge about bot traffic, let’s talk about why you should care.
For your website performance
Malicious bot traffic strains your web server and sometimes even overloads it. These bots take up your server bandwidth with their requests, making your website slow or utterly inaccessible in case of a DDoS attack. In the meantime, you might have lost traffic and sales to other competitors.
In addition, malicious bots disguise themselves as regular human traffic, so they might not be visible when you check your website statistics. The result? You might see random spikes in traffic but don’t understand why. Or, you might be confused as to why you receive traffic but no conversion. As you can imagine, this can potentially hurt your business decisions because you don’t have the correct data.
For your site security
Malicious bots are also bad for your site’s security. They will try to brute force their way into your website using various username/password combinations, or seek out weak entry points and report to their operators. If you have security vulnerabilities, these malicious players might even attempt to install viruses on your website and spread those to your users. And if you own an online store, you will have to manage sensitive information like credit card details that hackers would love to steal.
For the environment
Did you know that bot traffic affects the environment? When a bot visits your site, it makes an HTTP request to your server asking for information. Your server needs to respond, then return the necessary information. Whenever this happens, your server must spend a small amount of energy to complete the request. Now, consider how many bots there are on the internet. You can probably imagine that the amount of energy spent on bot traffic is enormous!
In this sense, it doesn’t matter if a good or bad bot visits your site. The process is still the same. Both use energy to perform their tasks, and both have consequences on the environment.
Even though search engines are an essential part of the internet, they’re guilty of being wasteful too. They can visit your site too many times, and not even pick up the right changes. We recommend checking your server log to see how many times crawlers and bots visit your site. Additionally, there’s a crawl stats report in Google Search Console that also tells you how many times Google crawls your site. You might be surprised by some numbers there.
A small case study from Yoast
Let’s take Yoast, for instance. On any given day, Google crawlers can visit our website 10,000 times. That might sound reasonable, but they only crawl 4,500 unique URLs. That means energy is spent crawling the same URLs over and over. Even though we regularly publish and update our website content, we probably don’t need all those crawls. And these crawls aren’t just for pages; crawlers also go through our images, CSS, JavaScript, etc.
But that’s not all. Google bots aren’t the only ones visiting us. There are bots from other search engines, digital services, and even bad bots too. Such unnecessary bot traffic strains our website server and wastes energy that could otherwise be used for other valuable activities.
Statistics on the crawl behavior of Google crawlers on Yoast.com in a day
What can you do against ‘bad’ bots?
You can try to detect bad bots and block them from entering your site. This will save you a lot of bandwidth and reduce strain on your server, which in turn helps to save energy. The most basic way to do this is to block an individual or an entire range of IP addresses. You should block an IP address if you identify irregular traffic from that source. This approach works, but it’s labor-intensive and time-consuming.
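If your site runs on Apache, for example, a minimal sketch of such a block in .htaccess could look like this (Apache 2.4 syntax is assumed, and the IP range is a placeholder from the documentation address space):

# Block a suspicious IP range while allowing everyone else
<RequireAll>
Require all granted
Require not ip 203.0.113.0/24
</RequireAll>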
Alternatively, you can use a bot management solution from providers like Cloudflare. These companies have an extensive database of good and bad bots. They also use AI and machine learning to detect malicious bots, and block them before they can cause harm to your site.
Security plugins
Additionally, you should install a security plugin if you’re running a WordPress website. Some of the more popular security plugins (like Sucuri Security or Wordfence) are maintained by companies that employ security researchers who monitor and patch issues. Some security plugins automatically block specific ‘bad’ bots for you. Others let you see where unusual traffic comes from, then let you decide how to deal with that traffic.
What about the ‘good’ bots?
As we mentioned earlier, ‘good’ bots are good because they’re essential and transparent in what they do. But they can still consume a lot of energy. Not to mention, these bots might not even be helpful for you. Even though what they do is considered ‘good’, they could still be disadvantageous to your website and the environment. So, what can you do for the good bots?
1. Block them if they’re not useful
You have to decide whether or not you want these ‘good’ bots to crawl your site. Does their crawling benefit you? More specifically: does their crawling benefit you more than it costs your server, their servers, and the environment?
Let’s take search engine bots, for instance. Google is not the only search engine out there. It’s most likely that crawlers from other search engines have visited you as well. What if a search engine has crawled your site 500 times today, while only bringing you ten visitors? Is that still useful? If this is the case, you should consider blocking them, since you don’t get much value from this search engine anyway.
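As a hedged example, blocking one specific crawler in robots.txt could look like this (Baiduspider is only used for illustration; pick whichever bot you’ve decided brings you no value):

User-agent: Baiduspider
Disallow: /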
2. Limit the crawl rate
If bots support the crawl-delay directive in robots.txt, you should try to limit their crawl rate. This way, they won’t come back every 20 seconds to crawl the same links over and over. Because let’s be honest, you probably don’t update your website’s content 100 times on any given day, even if you run a larger website.
You should play with the crawl rate and monitor its effect on your website. Start with a slight delay, then increase the number when you’re sure it doesn’t have negative consequences. Plus, you can assign a specific crawl delay to crawlers from different sources. Unfortunately, Google doesn’t support crawl-delay, so you can’t use this for Google’s bots.
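As a sketch, a per-bot delay in robots.txt could look like this (the user-agent and the ten-second value are just examples; only bots that actually honor the directive will respect it, and Google ignores it):

User-agent: bingbot
Crawl-delay: 10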
3. Help them crawl more efficiently
There are a lot of places on your website where crawlers have no business coming. Your internal search results, for instance. That’s why you should block their access via robots.txt. This not only saves energy, but also helps to optimize your crawl budget.
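On a WordPress site, for example, a minimal set of rules for internal search results could look like this (WordPress uses the ?s= parameter for search; adjust the paths if your setup differs):

User-agent: *
Disallow: /?s=
Disallow: /search/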
Next, you can help bots crawl your site better by removing unnecessary links that your CMS and plugins automatically create. For instance, WordPress automatically creates an RSS feed for your website comments. This RSS feed has a link, but hardly anybody looks at it anyway, especially if you don’t have a lot of comments. Therefore, the existence of this RSS feed might not bring you any value. It just creates another link for crawlers to crawl repeatedly, wasting energy in the process.
Optimize your website crawl with Yoast SEO
Yoast SEO has a useful and sustainable new setting: the crawl optimization settings! With over 20 available toggles, you’ll be able to turn off the unnecessary things that WordPress automatically adds to your site. You can see the crawl settings as a way to easily clean up your site of unwanted overhead. For example, you have the option to clean up the internal site search of your site to prevent SEO spam attacks!
Even if you’ve only started using the crawl optimization settings today, you’re already helping the environment!
The robots.txt file is one of the main ways of telling a search engine where it can and can’t go on your website. All major search engines support its basic functionality, but some respond to additional rules, which can be helpful too. This guide covers all the ways to use robots.txt on your website.
Warning!
Any mistakes you make in your robots.txt can seriously harm your site, so read and understand this article before diving in.
The robots.txt file is one of a number of crawl directives. We have guides on all of them and you’ll find them here.
A robots.txt file is a plain text document located in a website’s root directory, serving as a set of instructions to search engine bots. Also called the Robots Exclusion Protocol, the robots.txt file results from a consensus among early search engine developers. It’s not an official standard set by any standards organization, although all major search engines adhere to it.
Robots.txt specifies which pages or sections may be crawled and which should be ignored. This file helps website owners control the behavior of search engine crawlers, allowing them to manage access, keep crawlers out of specific areas, and regulate the crawling rate. While it’s a public document and compliance with its directives is voluntary, it is a powerful tool for guiding search engine bots and, indirectly, influencing what ends up in the index.
A basic robots.txt file might look something like this:
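# This example gives every crawler full access and points it to the XML sitemap
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap_index.xml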
Search engines typically cache the contents of the robots.txt so that they don’t need to keep downloading it, but will usually refresh it several times a day. That means that changes to instructions are typically reflected fairly quickly.
Search engines discover and index the web by crawling pages. As they crawl, they discover and follow links. This takes them from site A to site B to site C, and so on. But before a search engine visits any page on a domain it hasn’t encountered before, it will open that domain’s robots.txt file. That tells it which URLs on that site it’s allowed to visit (and which ones it’s not).
The robots.txt file should always be at the root of your domain. So if your domain is www.example.com, the crawler should find it at https://www.example.com/robots.txt.
It’s also essential that your robots.txt file is called robots.txt. The name is case-sensitive, so get that right, or it won’t work.
Pros and cons of using robots.txt
Pro: managing crawl budget
It’s generally understood that a search spider arrives at a website with a pre-determined “allowance” for how many pages it will crawl (or how much resource/time it’ll spend, based on a site’s authority/size/reputation, and how efficiently the server responds). SEOs call this the crawl budget.
If you think your website has problems with crawl budget, blocking search engines from ‘wasting’ energy on unimportant parts of your site might mean they can focus instead on the sections that do matter. Use the crawl cleanup settings in Yoast SEO to help Google crawl what matters.
It can sometimes be beneficial to block the search engines from crawling problematic sections of your site, especially on sites where a lot of SEO clean-up has to be done. Once you’ve tidied things up, you can let them back in.
A note on blocking query parameters
One situation where crawl budget is crucial is when your site uses a lot of query string parameters to filter or sort lists. Let’s say you have ten different query parameters, each with different values that can be used in any combination (like t-shirts in multiple colors and sizes). This leads to many possible valid URLs, all of which might get crawled. Blocking query parameters from being crawled will help ensure the search engine only spiders your site’s main URLs and won’t go into the enormous spider trap you’d otherwise create.
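As a hedged sketch, blocking such filter parameters could look like this (color and size are hypothetical parameter names; replace them with the ones your site actually uses):

User-agent: *
Disallow: /*?*color=
Disallow: /*?*size=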
Con: not removing a page from search results
Even though you can use the robots.txt file to tell a crawler where it can’t go on your site, you can’t use it to tell a search engine which URLs not to show in the search results – in other words, blocking a URL won’t stop it from being indexed. If the search engine finds enough links to that URL, it will include it; it will just not know what’s on that page. So the result is typically a bare listing: the URL appears without a description.
Use a meta robots noindex tag if you want to reliably block a page from appearing in the search results. That does mean that, to find the noindex tag, the search engine has to be able to access that page, so don’t block it with robots.txt.
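In a page’s HTML, that tag sits in the <head> and looks like this:

<meta name="robots" content="noindex">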
Noindex directives
It used to be possible to add ‘noindex’ directives in your robots.txt, to remove URLs from Google’s search results, and to avoid these ‘fragments’ showing up. This is no longer supported (and technically, never was).
Con: not spreading link value
If a search engine can’t crawl a page, it can’t spread the link value across the links on that page. It’s a dead-end when you’ve blocked a page in robots.txt. Any link value which might have flowed to (and through) that page is lost.
Robots.txt syntax
WordPress robots.txt
We have an article on how best to set up your robots.txt for WordPress. Don’t forget you can edit your site’s robots.txt file in the Yoast SEO Tools → File editor section.
A robots.txt file consists of one or more blocks of directives, each starting with a user-agent line. The “user-agent” is the name of the specific spider it addresses. You can have one block for all search engines, using a wildcard for the user-agent, or particular blocks for particular search engines. A search engine spider will always pick the block that best matches its name.
These blocks look like this (don’t be scared, we’ll explain below):
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow: /not-for-bing/
Directives like Allow and Disallow are not case-sensitive, so it’s up to you whether you write them in lowercase or capitalize them. The values are case-sensitive, however: /photo/ is not the same as /Photo/. We like capitalizing directives because it makes the file easier (for humans) to read.
The user-agent directive
The first bit of every block of directives is the user-agent, which identifies a specific spider. The user-agent field matches with that specific spider’s (usually longer) user-agent, so, for instance, the most common spider from Google has the following user-agent:
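Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

The exact string differs slightly between Googlebot’s desktop and smartphone crawlers, but the Googlebot token is what matters for robots.txt matching.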
If you want to tell this crawler what to do, a relatively simple User-agent: Googlebot line will do the trick.
Most search engines have multiple spiders. They will use a specific spider for their normal index, ad programs, images, videos, etc.
Search engines always choose the most specific block of directives they can find. Say you have three sets of directives: one for *, one for Googlebot and one for Googlebot-News. If a bot comes by whose user-agent is Googlebot-Video, it will follow the Googlebot restrictions. A bot with the user-agent Googlebot-News would use more specific Googlebot-News directives.
The most common user agents for search engine spiders
Here’s a list of the user-agents you can use in your robots.txt file to match the most commonly used search engines:
Search engine | Field | User-agent
Baidu | General | baiduspider
Baidu | Images | baiduspider-image
Baidu | Mobile | baiduspider-mobile
Baidu | News | baiduspider-news
Baidu | Video | baiduspider-video
Bing | General | bingbot
Bing | General | msnbot
Bing | Images & Video | msnbot-media
Bing | Ads | adidxbot
Google | General | Googlebot
Google | Images | Googlebot-Image
Google | Mobile | Googlebot-Mobile
Google | News | Googlebot-News
Google | Video | Googlebot-Video
Google | Ecommerce | Storebot-Google
Google | AdSense | Mediapartners-Google
Google | AdWords | AdsBot-Google
Yahoo! | General | slurp
Yandex | General | yandex
The disallow directive
The second line in any block of directives is the Disallow line. You can have one or more of these lines, specifying which parts of the site the specified spider can’t access. An empty Disallow line means you’re not disallowing anything, so a spider can access all sections of your site.
The example below would block all search engines that “listen” to robots.txt from crawling your site.
User-agent: *
Disallow: /
The example below would allow all search engines to crawl your site by dropping a single character.
User-agent: *
Disallow:
The example below would block Google from crawling the Photo directory on your site – and everything in it.
User-agent: googlebot
Disallow: /Photo
This means all the subdirectories of the /Photo directory would also not be spidered. It would not block Google from crawling the /photo directory, as these lines are case-sensitive.
This would also block Google from accessing URLs that begin with /Photo, such as /Photography/.
How to use wildcards/regular expressions
“Officially,” the robots.txt standard doesn’t support regular expressions or wildcards; however, all major search engines understand them. This means you can use lines like this to block groups of files:
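Disallow: /*.php
Disallow: /copyrighted-images/*.jpg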
In the example above, * is expanded to whatever filename it matches. Note that the rest of the line is still case-sensitive, so the second line above will not block a file called /copyrighted-images/example.JPG from being crawled.
Some search engines, like Google, allow for more complicated regular expressions but be aware that other search engines might not understand this logic. The most useful feature this adds is the $, which indicates the end of a URL. In the following example, you can see what this does:
Disallow: /*.php$
This means /index.php can’t be crawled, but /index.php?p=1 could be. Of course, this is only useful in very specific circumstances and pretty dangerous: it’s easy to unblock things you didn’t want to.
Non-standard robots.txt crawl directives
In addition to the commonly used Disallow and User-agent directives, there are a few other crawl directives available for robots.txt files. However, it’s important to note that not all search engine crawlers support these directives, so it’s essential to understand their limitations and considerations before implementing them.
The allow directive
While not in the original “specification,” there was talk of an allow directive very early. Most search engines seem to understand it, and it allows for simple and very readable directives like this:
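Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php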
The only other way of achieving the same result without an allow directive would have been to specifically disallow every single file in the wp-admin folder.
The crawl-delay directive
Crawl-delay is an unofficial addition to the standard, and few search engines adhere to it. At least Google and Yandex don’t use it, with Bing being unclear. In theory, as crawlers can be pretty crawl-hungry, you could try the crawl-delay directive to slow them down.
A line like the one below would instruct those search engines to change how frequently they’ll request pages on your site.
crawl-delay: 10
Do take care when using the crawl-delay directive. By setting a crawl delay of ten seconds, you only allow these search engines to access 8,640 pages a day. This might seem plenty for a small site, but it isn’t much for large sites. On the other hand, if you get next to no traffic from these search engines, it might be a good way to save some bandwidth.
The sitemap directive for XML Sitemaps
Using the sitemap directive, you can tell search engines – Bing, Yandex, and Google – where to find your XML sitemap. You can, of course, submit your XML sitemaps to each search engine using their webmaster tools. We strongly recommend you do so because webmaster tools will give you a ton of information about your site. If you don’t want to do that, adding a sitemap line to your robots.txt is a quick alternative. Yoast SEO automatically adds a link to your sitemap if you let it generate a robots.txt file. On an existing robots.txt file, you can add the rule by hand via the file editor in the Tools section.
Sitemap: https://www.example.com/my-sitemap.xml
Don’t block CSS and JS files in robots.txt
Since 2015, Google Search Console has warned site owners not to block CSS and JS files. We’ve told you the same thing for ages: don’t block CSS and JS files in your robots.txt. Let us explain why you shouldn’t block these specific files from Googlebot.
By blocking CSS and JavaScript files, you’re preventing Google from checking if your website works correctly. If you block CSS and JavaScript files in your robots.txt file, Google can’t render your website as intended. Now, Google can’t understand your website, which might result in lower rankings. Moreover, even tools like Ahrefs render web pages and execute JavaScript. So, don’t block JavaScript if you want your favorite SEO tools to work.
This aligns perfectly with the general assumption that Google has become more “human.” Google wants to see your website like a human visitor would, so it can distinguish the main elements from the extras. Google wants to know if JavaScript enhances the user experience or ruins it.
Test and fix in Google Search Console
Google helps you find and fix issues with your robots.txt, for instance, in the Page Indexing section in Google Search Console. Select the Blocked by robots.txt option:
Check Search Console to see which URLs are blocked by your robots.txt
Unblocking blocked resources comes down to changing your robots.txt file. You need to set that file up so that it no longer disallows Google from accessing your site’s CSS and JavaScript files. If you’re on WordPress and use Yoast SEO, you can do this directly with our Yoast SEO plugin.
Validate your robots.txt
Various tools can help you validate your robots.txt, but we always prefer to go to the source when validating crawl directives. Google has a robots.txt testing tool in its Google Search Console (under the ‘Old version’ menu), and we’d highly recommend using that:
Testing a robots.txt file in Google Search Console
Be sure to test your changes thoroughly before you put them live! You wouldn’t be the first to accidentally use robots.txt to block your entire site and slip into search engine oblivion!
Behind the scenes of a robots.txt parser
In 2019, Google announced they were making their robots.txt parser open source. If you want to get into the nuts and bolts, you can see how their code works (and even use it yourself or propose modifications).
Do you want to outrank your competition? Then basic knowledge of technical SEO is a must. Of course, you also need to create great and relevant content for your site. Luckily, the Yoast SEO plugin takes care of (almost) everything on your WordPress site. Still, it’s good to understand one of the most important concepts of technical SEO: crawlability.
What is the crawler again?
A search engine like Google consists of three things: a crawler, an index, and an algorithm. A crawler follows the links on the web. It does this 24/7! Once a crawler comes to a website, it saves the HTML version in a gigantic database called the index. This index is updated every time the crawler comes around your website, and finds a new or revised version of it. Depending on how important Google deems your site and the number of changes you make on your website, the crawler comes around more or less often.
Fun fact: A crawler is also called a robot, a bot, or a spider! And Google’s crawler is sometimes referred to as Googlebot.
Crawlability has to do with the possibilities Google has to crawl your website. You can block crawlers from (parts of) your site. If your website or a page on your website is blocked, you’re telling Google’s crawler: “Do not come here.” As a result, your site or the respective page won’t turn up in the search results – at least, in most cases.
So how do you block crawlers? There are a few things that could prevent Google from crawling (or indexing) your website:
If your robots.txt file blocks the crawler, Google will not come to your website or specific web page.
Before crawling your website, the crawler will take a look at the HTTP header of your page. This HTTP header contains a status code. If this status code says that a page doesn’t exist, Google won’t crawl that page. Want to know more? We explain all about HTTP headers in a module of our Technical SEO training!
If the robots meta tag on a specific page blocks the search engine from indexing that page, Google will crawl that page, but won’t add it to its index.
How crawlers impact the environment
Yes, you read that right. Crawlers have a substantial impact on the environment. Here’s how: Crawlers can come to your site multiple times a day. Why? They want to discover new content, or check if there are any updates. And every time they visit your site, they will crawl everything that looks like a URL to them. This means a URL is often crawled multiple times per day.
This is unnecessary, because you’re unlikely to make multiple changes to a URL on any given day. Not to mention, almost every CMS outputs URLs that don’t make sense and that crawlers could safely skip. But instead of skipping these URLs, crawlers will crawl them, again and again, every time they come across one. All this unnecessary crawling wastes a ton of energy, which is harmful to our planet.
Improve your site’s crawlability with Yoast SEO Premium
To ensure you’re not wasting energy, it’s important to stay on top of your site’s crawlability settings. Luckily, you don’t have to do all the work yourself. Using tools such as Yoast SEO Premium will make it easier for you!
So how does it work? We have a crawl settings feature that removes unnecessary URLs, feeds, and assets from your website. This will make crawlers crawl your website more efficiently. Don’t worry, you’re still in control! Because the feature also allows you to decide per type of asset whether you want to actually remove the URL or not. If you want to know more, we’ll explain all about the crawl settings here.
Want to learn more about crawlability?
Although crawlability is a basic part of technical SEO (it has to do with all the things that enable Google to index your site), it’s already pretty advanced stuff for most people. Still, it’s important that you understand what crawlability is. You might be blocking – perhaps even without knowing! – crawlers from your site, which means you’ll never rank high in Google. So, if you’re serious about SEO, crawlability should matter to you.
An easy way to learn is by doing our technical SEO trainings. These SEO courses will teach you how to detect technical SEO issues and solve them (with our Yoast SEO plugin). We also have a training dedicated to crawlability and indexability! Good to know for Premium users: Yoast SEO Academy is already included at no extra cost in your Premium subscription!
Today, we’re very excited to be releasing Yoast SEO 20.4. With this release, we’re bringing our crawl optimization feature to Yoast SEO Free. With this feature, you can improve your SEO and reduce your carbon footprint with just a few clicks. This blog post will tell you about this feature and why we’ve brought it to Yoast SEO.
Before we explain this Yoast SEO feature, it’s good to start with a quick reminder of what crawling is. Search engines like Google or Bing use crawlers, also known as bots, to find your website, read it, and save its content to their index. They go around the internet 24/7 to ensure the content saved in their index is as up-to-date as possible. Depending on the number of changes you make on your website and how important search engines deem your site, the crawler comes around more or less often.
That’s nice, but did you know crawlers do an incredible amount of unnecessary crawling?
Let’s reduce unnecessary crawling
As you can imagine, search engine crawlers don’t just visit your website but every single one they can find. The incredible number of websites out there keeps them quite busy. In fact, bots are responsible for around 30% of all web traffic. This uses lots of electricity, and a lot of that crawling isn’t necessary at all. This is where our crawl optimization feature comes in. With just a few simple changes, you can tell search engines like Google which pages or website elements they can skip — making it easier to visit the right pages on your website while reducing the energy wasted on unnecessary bot traffic.
The carbon footprint of your website
You might be wondering why we want to help you reduce the energy consumption of your website. Does it make that much of a difference? The answer is yes! Regardless of the size of your website, the fact is that your website has a carbon footprint. Internet usage and digital technology are two massive players in pollution and energy consumption.
Every interaction on your website results in electricity being used. For instance, when someone visits your website, their browser needs to make an HTTP request to your server, and that server needs to return the necessary information. On the other side, the browser also needs the power to process data and present the page to the visitor. The energy needed to complete these requests might be small, but it adds up when you consider all the interactions on your website. Similar to when a visitor lands on your site, crawlers or bots also make these requests to your server that cost energy. Considering the amount of bot traffic (30% of web traffic), reducing the number of irrelevant pages and other resources crawled by search engines is worth it.
Take control of what’s being crawled
The crawl optimization feature in Yoast SEO lets you turn off crawling for certain types of URLs, scripts, and metadata that WordPress automatically adds. This makes it possible to improve your SEO and reduce your carbon footprint with just a few clicks.
Check out this fun animation to get an idea of what this feature can do for your website:
The crawl optimization feature was already part of Yoast SEO Premium, but today we’re also bringing it to the free version of our plugin. We do this to make as much of an impact as possible. There are over 13 million Yoast SEO users, so if everyone’s website crawling is optimized, we can have an enormous impact!
How to use the crawl optimization feature
How do you get started with crawl optimization for your website? Just go to Yoast SEO > Settings > Advanced > Crawl optimization. Here you will find an overview of all the types of metadata, content formats, etc., that you can tell search engines not to crawl. You can use the toggles on the right to enable crawl optimization.
Screenshot of the Crawl optimization section in Yoast SEO settings
The crawl optimization settings in Yoast SEO 20.4 allow you to:
Remove unwanted metadata: WordPress adds a lot of links and content to your site’s <head> section and HTTP headers. For most websites, you can safely disable these, making your site faster and more efficient.
Disable unwanted content formats: For every post, page, and category on your site, WordPress creates multiple types of feeds: content formats designed to be consumed by crawlers and machines. But most of these are outdated, and many websites won’t need to support them. Disable the formats you’re not actively using to improve your site’s efficiency.
Remove unused resources: WordPress loads countless resources, some of which your site might not need. Removing these can speed up your site and save energy if you’re not using them.
Internal site search cleanup: Your internal site search can create many confusing URLs for search engines and can even be used by SEO spammers to attack your site. This feature identifies some common spam patterns and stops them in their tracks. Most sites will benefit from experimenting with these optimizations, even if your theme doesn’t have a search feature.
Advanced: URL cleanup: Users and search engines may often request your URLs using query parameters, like ?color=red. These can help track, filter, and power advanced functionality – but they come at a performance and SEO ‘cost.’ Sites that don’t rely on URL parameters might benefit from these options. Important note: These are expert features, so ensure you know what you’re doing before removing the parameters.
That’s it for now. Make sure to update to Yoast SEO 20.4 and optimize your website’s crawling immediately! It’s not only better for your website, your site visitors, and search engines; it also has a positive impact on our environment. Especially when you realize how many of us there are: if all 13 million of us optimize the crawling on our websites, we can reduce the amount of energy used by a ridiculous amount. So let’s start right now!
HTTP status codes, like 404, 301, and 500, might not mean much to a regular visitor, but they are incredibly important for SEO. Not only that, search engine spiders, like Googlebot, use these to determine the health of a site. These status codes offer a way of seeing what happens between the browser and the server. Several of these codes indicate an error, for instance, that the requested content can’t be found, while others simply suggest a successful delivery of the requested material. In this article, we’re taking a closer look at the most important HTTP header codes and what they mean for SEO.
What are HTTP status codes, and why do you see them?
An HTTP status code is a three-digit message the server sends when a request made by a browser can or cannot be fulfilled. According to the official HTTP specifications, there are dozens of status codes, many of which you’re unlikely to come across. If you need a handy overview of status codes, including their code references, you can find one on HTTPstatuses.com.
To fully understand these codes, you must know how a browser gets a web page. Every website visit starts by typing in the URL of a site or entering a search term in a search engine. The browser looks up the site’s IP address and requests the associated web page from the server. The server responds with a status code embedded in the HTTP header, telling the browser the result of the request. When everything is fine, an HTTP 200 header code is sent back to the browser along with the website’s content.
However, it is also possible that there’s something wrong with the requested content or server. It could be that the page is not found, which gives back a 404 error page, or there might be a temporary, technical issue with the server, resulting in a 500 Internal Server Error. These HTTP status codes are an important tool for evaluating the health of the site and its server. If a site regularly sends improper HTTP header codes to a search engine indexing its contents, it might cause problems that will hurt its rankings.
Here’s what part of the HTTP response header for a web page might look like, with a 200 OK message (the exact headers and values vary per server):
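HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Cache-Control: max-age=600
Vary: Accept-Encoding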
There are five ranges of HTTP status codes, defining different aspects of the transaction process between the client and the server. Below you’ll find the five ranges and their main goal:
1xx – Informational
2xx – Success
3xx – Redirection
4xx – Client error
5xx – Server error
If you ever try to brew coffee in a teapot, your teapot will probably send you the status message 418: I’m a teapot.
Most important HTTP status codes for SEO
As we’ve said, the list of codes is long, but a few are especially important for SEOs and anyone working on their own site. We’ll do a quick rundown of these below:
200: OK / Success
This is how it probably should be; a client asks the server for content and the server replies with a 200 success message and the content the client needs. The server and the client are happy — and the visitor, of course. All messages in 2xx mean some sort of success.
301: Moved Permanently
A 301 HTTP header is used when the requested URL is permanently moved to a new location. As you are working on your site, you will often use this, because you regularly need to make a 301 redirect to direct an old URL to a new one. If you don’t, users will see a 404 error page if they try to open the old URL and that’s not something you want. Using a 301 will make sure that the link value of the old URL transfers to the new URL.
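As an illustration, a 301 response looks roughly like this (the target URL is a placeholder):

HTTP/1.1 301 Moved Permanently
Location: https://www.example.com/new-url/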
302: Found
A 302 means that the target destination has been found, but it lives in a different location. However, it is a rather ambiguous status code because it doesn’t say whether this is a temporary situation. Use a 302 redirect only if you want to temporarily redirect a URL to a different source and are sure you will use the same URL again.
Since you tell search engines that the URL will be used again, none of the link value is transferred to the new URL, so you shouldn’t use a 302 when moving your domain or making big changes to your site structure, for instance. Also, when you leave 302 redirects in place for a long time, search engines can treat these 302 redirects as 301 redirects.
304: Not Modified
A 304 Not Modified response is an HTTP status code indicating that the requested resource has not been modified since the last time the client accessed it. It means that the server does not need to send the resource again but instead tells the client to use a cached version. The 304 response code is a way to save crawl budget for large websites, because Google’s crawler won’t recrawl unchanged pages and can instead focus on crawling new and updated pages.
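As a rough sketch, the exchange looks like this (the host, file, and date are placeholders):

GET /style.css HTTP/1.1
Host: www.example.com
If-Modified-Since: Tue, 01 Aug 2023 10:00:00 GMT

HTTP/1.1 304 Not Modified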
307: Temporary Redirect
The 307 code replaces the 302 in HTTP/1.1 and could be seen as the only ‘true’ temporary redirect. You can use a 307 redirect if you need to temporarily redirect a URL to a new one while keeping the original request method intact. A 307 looks a lot like a 302, except that it says explicitly that the URL has a temporary new location. The target can change over time, so the client has to keep using the original URL when making new requests.
403: Forbidden
A 403 tells the browser that the requested content is forbidden for the user. If they don’t have the correct login credentials, this content stays forbidden for that user.
404: Not Found
As one of the most visible status codes, the 404 HTTP header code is also one of the most important. When a server returns a 404 error, you know the content has not been found and is probably deleted. Try not to bother visitors with these messages, so fix these errors when you can. Use a redirect to send visitors from the old URL to a new article or page with related content.
Monitor these 404 messages in Google Search Console and keep them to a minimum. A lot of 404 errors might be seen by Google as a sign of bad maintenance, which in turn might influence your overall rankings. If a page is broken and should be gone from your site, a 410 sends a clearer signal to Google.
410: Gone
The result from a 410 status code is the same as a 404 since the content has not been found. However, with a 410, you tell search engines that you deleted the requested content. Thus, it’s much more specific than a 404. In a way, you order search engines to remove the URL from the index. Before permanently deleting something from your site, ask yourself if there is an equivalent of the page somewhere. If so, make a redirect. If not, maybe you shouldn’t delete it and just improve it.
451: Unavailable For Legal Reasons
The 451 HTTP status code shows that the requested content was deleted for legal reasons. If you received a takedown request or a judge ordered you to take specific content offline, you should use this code to tell search engines what happened to the page.
500: Internal Server Error
A 500 error is a generic message saying the server encountered an unexpected condition that prevented it from fulfilling the request, without saying what caused it. These errors could come from anywhere. Maybe your web host is doing something funny, or a script on your site is malfunctioning. Check your server’s logs to see where things go wrong.
503: Service Unavailable
A 503 HTTP status code is a server-side error that indicates that the server is temporarily unable to handle the request. This could be due to overloading, maintenance, or other issues on the server. A 503 status code can affect SEO if it lasts long, as it may signal to search engines that the site is unreliable or unavailable. To avoid negative SEO impacts, a 503 status code should be used only for short-term situations and provide crawlers with a clear message about when the site will return online. You can use the Retry-After value to ask crawlers to try again after a certain amount of time.
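For example, a maintenance response could include a Retry-After header like this (the one-hour value is just an illustration):

HTTP/1.1 503 Service Unavailable
Retry-After: 3600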
HTTP status codes are a big part of the lives of SEOs and that of search engine spiders. You’ll encounter them daily, and it’s key to understanding what the different status codes mean. For instance, if you delete a page from your site, you must know the difference between serving a 301 and a 410. They serve different goals and, therefore, have different results.
To understand the kinds of status codes your site generates, you should log into your Google Search Console. In the Indexing section, you’ll find the crawl errors Googlebot found over a certain time. These crawl errors must be fixed before your site can be indexed correctly.
Google Search Console lists the errors it found on your site
Manage redirects with Yoast SEO Premium
We get it; working with these things is time-consuming and boring. However, creating redirects has never been easier if you use Yoast SEO Premium. Whenever you delete or move a post or page, the Redirect Manager in Yoast SEO asks you whether you want to redirect it. Just pick the correct option, and you’re good to go.
That’s all, folks
Make yourself familiar with these codes because you’ll see them pop up often. Knowing which redirects to use is an important skill that you’ll have to count on often when optimizing your site. One look at the crawl errors in Google Search Console should be enough to show you how much is going on under the hood.
As someone working on SEO, you must understand the importance of site speed. You must realize that fast sites equal happy users and happy search engines. PageSpeed Insights is an invaluable tool from Google that can help you optimize your website. It enables you to improve your rankings by giving you everything you need to boost the performance of your website. This guide will provide an overview of PageSpeed Insights. We’ll discuss what it is, how it works, and how you can optimize your website.
PageSpeed Insights (PSI) is a free tool offered by Google. It provides valuable insights into the performance and speed of your pages. The tool evaluates website performance and page experience based on several key metrics, including loading speed, resource utilization, images, and other media optimization. PSI works at a page level, so a good score for a page does not automatically equal a good score for your entire site.
The tool provides a score from 0 to 100, with 100 being the “fastest” and most “performant” web page. Note that getting a score of 100 is not something you need to aim for by any means. But your pages should pass the general Core Web Vitals Assessment. Remember those words; you’ll hear them often — more on this topic further down this article.
The PageSpeed Insights page after running a test for cnn.com
PageSpeed Insights provides data on how quickly your page loads, how many resources it uses, and how many requests it makes when loading. Then it also offers suggestions on how to make your pages better. With the help of this tool, you can identify areas of improvement. Use that knowledge to make the necessary changes to improve your website’s rankings.
In addition, PageSpeed Insights also checks your page on SEO and accessibility aspects and other best practices. In this article, we’ll focus on site speed and performance checks.
To understand PSI and how it fits into the page speed part of SEO, please read the following articles:
PageSpeed Insights runs tests and analyzes the HTML, JavaScript, and other resources that make up your website. The tool then provides a detailed report highlighting areas where the page the test was run on can be optimized. These suggestions include specific recommendations for improving your website’s speed and performance. The tool evaluates how your site functions on desktop and mobile devices, ensuring you optimize your website for all users.
An insight into PageSpeed Insight
Here’s a little bit more insight into how the test process works:
URL analysis: The first step in the PSI process is examining the URL being tested. This URL can be any online content with a valid URL, such as a product page, blog post, or other web-based material. Remember that a PSI test is specific to this URL and doesn’t automatically translate to your overall website performance.
Retrieving page content: Once the URL is submitted, PageSpeed Insights will retrieve the page’s content, including the HTML, CSS, JavaScript, images, and all other elements necessary to render the page.
Performance evaluation: After the page content is retrieved, PSI will conduct several tests to assess the page’s speed and efficiency. These tests analyze factors like page size and structure, resource quantity and size, and page load time.
Optimization recommendations: Based on the results of the performance tests, PageSpeed Insights provides suggestions for optimizing the page to improve it. These recommendations include reducing image size, simplifying CSS and JavaScript, enabling browser caching, and reducing the number of requests made to the server.
Scores: PSI will assign a score to the page based on how it does. The score ranges from 0 to 100, with higher scores representing better performance. It calculates the score based on the test results and optimization recommendations.
It’s worth noting that PageSpeed Insights only assesses how a single page on your website performs. It does not take appearance or functionality into account. However, enhancing page performance often positively impacts how people perceive your site.
Google frequently updates PSI to provide the most current information and accurate results. By utilizing PageSpeed Insights, you can gain a deeper understanding of page performance. It helps you improve the user experience and increase your website’s overall speed and efficiency.
PSI metrics: lab data vs. field data
PageSpeed Insights offers a combination of laboratory and real-world data to help you comprehend and enhance your site’s functionality. The lab data represents a simulation of the website’s performance in a managed setting. The field data portrays actual metrics collected from real users visiting the website.
The lab data is obtained by conducting automated tests on the website through a standard testing environment. The tests assess load time, resource utilization, rendering speed, and more. Lab data provides a baseline for performance. It helps you spot problems impacting user experience, like slow-loading resources or unoptimized images. One weakness of lab data is that it captures a specific point in time, and external factors like the weather, network stress, whether there’s a football game on, etc., can all affect real user experience. Your website needs to anticipate that.
The field data, on the other hand, delivers a more precise representation of how users encounter the website in the real world. This data is collected by monitoring users’ browsers and comes from the Chrome User Experience Report (CrUX). Field data offers valuable perspectives on user interaction with the website, such as which pages are slow or visually unstable. It also considers how factors like network connectivity and device type impact user experience.
Both laboratory and field data have advantages and limitations, making it crucial to use both to understand a website truly. Lab data provides a baseline and helps identify problems, while field data offers a more authentic view of user experience. By merging both data types, you can make informed choices on optimizing your website and enhancing the user experience.
Getting started with PageSpeed Insights
Starting with PageSpeed Insights is very easy. You can just enter the URL of the page you want to test into the tool and click the blue Analyze button. The tool will then run a series of tests on your page and generate a report. The report will provide a score for that specific URL’s performance and recommendations for improvement.
PSI only works at a page level. It looks at the one URL you enter to analyze – it is not a tool for site-wide analysis. Therefore, it’s good to test various pages of your site, as your homepage will perform differently from a blog post or a product page on your ecommerce site. Together, these tests give you a good sense of your site’s overall performance and where the bottlenecks are.
Enter your URL in the text field and hit the Analyze button
Getting the recommendations is easy, but implementing the fixes is another story. The issues are prioritized, with the most pressing ones at the top. The report also lists the opportunities it sees to boost the scores of your page. The colored bars show how many seconds you could save by implementing each improvement; the red bars have the biggest impact on how your page performs.
Take action on the recommendations provided by the tool to improve your website’s performance. With PSI, you can start improving your website right away.
Key metrics evaluated by PageSpeed Insights
Some time ago, Google introduced the Page Experience algorithm update. With it came the Core Web Vitals, a set of metrics that measure the real-world user experience of a website. The Core Web Vitals include LCP, FID, and CLS. These metrics are crucial to determining how well a page scores on the test. This test aims to replicate a user’s experience loading and using a website.
Improving the Core Web Vitals of your website is essential for optimizing your website for both user experience and search engine rankings. You can ensure that your website loads quickly by improving your LCP, FID, CLS, and other key metrics. It provides users with a positive experience that will keep them on your website longer.
Some key metrics detailed
PageSpeed Insights evaluates the Page Experience of a site based on several key metrics, including:
First Contentful Paint (FCP): This metric measures the time it takes for the first content on a page to become visible to the user. A fast FCP helps ensure that users don’t have to wait long to see something on the screen after landing on a page.
Largest Contentful Paint (LCP): This metric measures the time it takes for the largest content element on a page to become fully visible to the user. A fast LCP is crucial for a good user experience, as it indicates when the page will likely be fully loaded and ready to use.
Cumulative Layout Shift (CLS): This metric measures the visual stability of a page during loading and user interaction. A low CLS score indicates that the page’s content does not shift around as it loads, providing a better user experience. CLS accounts for 25% of the Lighthouse performance score.
Interaction to Next Paint (INP): This metric measures how a page responds to user interaction by updating the screen. A fast INP helps provide a smooth user experience, ensuring that the page reacts quickly to user inputs.
First Input Delay (FID): This metric measures the time it takes a page to respond to the first user interaction, such as clicking a button or entering text. A fast FID helps to ensure that the page reacts quickly to user inputs, providing a better user experience.
Time to First Byte (TTFB): This metric measures the duration from when a browser requests a page until the first byte of data from the server arrives at the client. TTFB is a crucial metric for website performance and user experience as it indicates any bottlenecks in the server-side processing or if the server is taking too long to generate the content.
Total Blocking Time (TBT): This metric measures the total time during which the page’s main thread is blocked, preventing it from responding to user input. This is significant because it reflects the period during which users cannot interact with the website or access its content, affecting the user experience. TBT determines 30% of the Lighthouse performance score.
The most recent scoring weights provided by Lighthouse for PageSpeed Insights
PageSpeed Insights also has a Speed Index
In addition to the Core Web Vitals and these additional metrics, PageSpeed Insights also considers other factors when calculating scores. The Speed Index is a metric that gauges the perceived loading speed of a website. It offers a rating based on the speed at which the website’s content becomes visible during the loading process, from start to finish.
The Speed Index is a crucial metric to be aware of and monitor, as it demonstrates how quickly users can view and interact with the website’s content. A website that loads quickly can increase user engagement, reduce bounce rates, and improve conversions. Thus, monitoring the Speed Index score and taking action to improve it, if needed, is important to you.
PSI also makes the loading process insightful with screenshots of your site
The scores in PageSpeed Insights provide a general indication of how well your page does. You should not see this as the only factor determining the overall user experience. By addressing the issues identified by PageSpeed Insights, you can improve performance and provide a better user experience.
The overview screen with Web Vitals scores
Opening the results screen, you see the six colored bars of the Core Web Vitals Assessment. PageSpeed Insights provides a snapshot of how well a site performs based on three important metrics (the Core Web Vitals) and three experimental metrics. These metrics evaluate crucial aspects of the user experience, including loading speed, interactivity, and visual stability.
The Core Web Vitals Assessment section
In the Core Web Vitals Assessment section, you’ll find an easy-to-understand evaluation of how the website performs for each of these metrics based on data. Further down the page, you’ll find suggestions for enhancing the website for each metric to improve the user experience.
In the diagnose performance issues section, you’ll find a graphic representation of the loading process of your page. It also features scores for performance, accessibility, best practices, and SEO.
Keep an eye on these Core Web Vitals, and make a fast, responsive, and visually stable website. All of this is crucial for attracting and retaining users.
The Diagnostics screen lists improvements
The Diagnostics section in PageSpeed Insights provides in-depth insights and advice for enhancing your website. There is a lot to find here, but let’s look at a popular one as an example. One of its suggestions is to Reduce the impact of third-party code.
PSI shows which scripts block the loading of your page
Third-party code refers to scripts and widgets hosted on external servers and embedded into a website. These can significantly affect a site’s performance by slowing page load times and utilizing resources.
PageSpeed Insights helps you pinpoint the third-party scripts affecting your website’s speed. You can find this in the Reduce the impact of third-party code suggestion. It displays information about each third-party script’s size, type, and effect and recommends reducing its impact.
For instance, the tool may advise minimizing non-critical third-party scripts or optimizing script loading through lazy or asynchronous loading methods. It might also suggest hosting third-party scripts on a content delivery network (CDN) to improve loading speed by reducing latency.
Following PageSpeed Insights’ suggestions in the Diagnostics section helps you minimize the impact of these issues.
How to improve your PageSpeed Insights score
Improving the performance of your site helps improve your PageSpeed Insights score. Below you’ll find a sampling of things you can do to make your site faster. We discuss this topic in more detail in our post on page speed.
Minimize the size of resources: The size of the resources on your website, such as images and other media, can significantly impact your website’s speed and performance. Minimizing the size of these resources can help reduce the time it takes for your website to load.
Optimize images: Optimizing images is one of the most effective ways to make your site faster. You can optimize images by compressing them, reducing their size, and converting them to a more optimized format.
Choose a better web host: The quality of your web host plays a critical role in the speed and reliability of your website. A good web host should provide fast and stable server resources, network connectivity, and a server location close to your target audience (with a CDN).
Use a content delivery network (CDN): A CDN can help distribute your website’s resources across multiple servers, reducing the load on your server and making your website perform better.
Minimize plugins: Plugins can slow down your website and negatively impact how it performs. Minimizing plugins and choosing lightweight, high-quality plugins can help improve your website’s speed and performance.
Use lazy loading: Lazy loading is a technique that only loads images and other resources when needed rather than loading them all at once. This can help reduce the time it takes for your website to load.
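To illustrate that last tip: modern browsers support native lazy loading through a simple attribute on the image tag (the file name, dimensions, and alt text below are placeholders):

<img src="example-photo.jpg" alt="Example photo" width="800" height="600" loading="lazy">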
By following these tips, you can improve your PageSpeed Insights score. The result is a faster, performant website that provides a better user experience and ranks higher in search results.
Conclusion
PageSpeed Insights is an invaluable tool for everyone working on the SEO of their sites. The tool provides valuable insights into how your website performs and how fast it loads. Make sure that you understand the key metrics evaluated by PageSpeed Insights. After that, optimize your website accordingly. This way, you can improve the performance, resulting in a better user experience. In turn, that might lead to higher rankings in search engines!