Common Google Search Console 404 question:
I run a job search & career coaching platform, for which we have been actively using Google for Jobs for a couple of months. Today I have a question about the job detail URLs we use for Google for Jobs.
As you might imagine, jobs expire over time or are deleted because the positions are filled. Once this happens, the job detail URL is closed: we send a 404 status code, add a noindex tag to the page, and even set a validThrough date in the past to make sure these URLs are no longer served in G4J. As a side note, we also update these URLs in the Indexing API to notify Google that they are no longer valid.
However, this ongoing process is resulting in an increasing number of 404 URLs, as you can see in the screenshot from GSC.
My questions:
- Would it be better to change the status code from 404 to 410 to remove expired/deleted job pages more quickly?
- Since these jobs are not coming back, should we 301 redirect these URLs to the homepage or another useful page after a certain amount of time?
- Does having an increasing number of 404 URLs affect my crawl budget?
Google Response:
Adding noindex to a 404 page isn't necessary; the 404 status trumps the on-page meta tag.
1. You'd like to think that a 410 is treated more seriously than a 404 (given it's a more 'deliberate' action), but in reality there is little evidence that this is actually the case. GSC doesn't even report them separately (410s are simply bundled into the 404 category).
https://www.youtube.com/watch?v=xp5Nf8ANfOw
2. You should never redirect deleted content straight to your homepage; that's very bad practice.
The best approach is to serve a custom 404 page with a "click here to..." link. That way the bot gets a 404 and your searcher is not left hanging.
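As a rough illustration of point 2, here is a minimal Python sketch of a job detail handler that returns a real 404 status together with a helpful page body. Everything in it is an assumption made for the example: the `JOBS` store, the `/jobs` link target, and the `respond_to_job_request` function are hypothetical, not part of any particular framework or of the platform described in the question.

```python
from http import HTTPStatus

# Hypothetical in-memory store (job id -> record); a real platform
# would look the job up in its database instead.
JOBS = {
    "101": {"title": "Backend Engineer", "expired": False},
    "102": {"title": "Career Coach", "expired": True},
}

# Custom 404 body: the status tells the bot the page is gone,
# the link gives a human visitor somewhere useful to go.
NOT_FOUND_PAGE = (
    "<html><body><h1>This job is no longer available</h1>"
    '<p><a href="/jobs">Click here to browse open jobs</a></p>'
    "</body></html>"
)


def respond_to_job_request(job_id):
    """Return (status_code, html_body) for a job detail URL."""
    job = JOBS.get(job_id)
    if job is None or job["expired"]:
        # Expired or unknown jobs get a 404 status, not a redirect
        # to the homepage.
        return HTTPStatus.NOT_FOUND, NOT_FOUND_PAGE
    return HTTPStatus.OK, f"<html><body><h1>{job['title']}</h1></body></html>"
```

In a real web framework the same idea applies: set the response status to 404 and render the custom template, rather than issuing a 301 to another page.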
3. 404s definitely don't hurt your rankings.
It is true, however, that they can use at least some of your crawl budget. The impact is not major, though: as noted at the top of this response, what's on the page (if anything) is irrelevant once the bot has detected the 404 status in the HTTP response.