I recently found out that I had a soft 404 on one of my "tag" pages, which are pages containing all posts tagged with a specific word. The failure was on the page that had posts tagged with the keyword "error". It had the below error in the new search console:
So I was wondering, what is a soft 404? I knew what a normal 404 was, but I had never heard of a soft one. It turns out that a soft 404 is when Google decides that your page looks like a "not found" page even though the server does not return an actual 404 status code. Google states that:
Soft 404: The page request returns what we think is a soft 404 response. This means that it returns a user-friendly "not found" message without a corresponding 404 response code. We recommend returning a 404 response code for truly "not found" pages, or adding more information to the page to let us know that it is not a soft 404.
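To make the definition concrete, here is a minimal sketch of the idea in Python. The phrase list and the function name are my own assumptions for illustration; Google does not publish the signals it actually uses, so this is not their check.

```python
# Hypothetical phrases for illustration only; Google's real soft 404
# signals are not public.
NOT_FOUND_PHRASES = ("not found", "page does not exist", "no longer available")


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Return True when a 200 response reads like a 'not found' page."""
    if status_code != 200:
        # A real 404, 410 or 500 is a hard error, not a soft 404.
        return False
    text = body_text.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)
```

With this sketch, a page that answers 200 but says "Sorry, page not found" is flagged, while the same text served with a real 404 status code is not.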
However, the page I was serving returned a 200 status code and looked like the following. It contained a total of 4 posts, of which only the first is shown below:
To me, the above does not flag anything "404ish". My first guess was that it was being regarded as a 500 error page, since its title was simply "error". So I did a little experimenting. I changed the title (of all my tag pages) to contain a little more text:
I was pretty confident that this would fix my issue. However, it did not. Google still would not index my page and kept reporting it as a soft 404. I had read online that adding extra or alternative content can resolve this issue. For example, if you have a webshop and an item is out of stock, the product page could be flagged as a soft 404. This can often be resolved by adding "alternative" or "similar" products to the page.
However, I decided to do something entirely different: I decided to remove content from the page. I took a look at the posts the page linked to and at their titles. One of them was named "IIS Error 500.19 - Internal Server Error - The requested page cannot be accessed". I thought this could look like an error to search engines, so I removed it, and voilà. Suddenly Google accepted my page for indexing.
Bottom line: check your content. If a page with real content is reported as a soft 404, there is probably something on it that the crawler takes for an error. My page did not look like a 404, but rather like a 500 internal server error. It was more of a "soft 500".
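If you want to check your own pages, something like the sketch below could help find the suspicious titles. The regular expression is purely my guess at what might look "error-like" to a crawler, and the second title in the list is made up for the example.

```python
import re

# Hypothetical pattern: an HTTP-style 4xx/5xx code (such as 500 or 500.19)
# or common error wording. This is a guess, not Google's actual rule.
ERROR_LIKE = re.compile(
    r"\b[45]\d{2}(\.\d+)?\b|internal server error|not found|cannot be accessed",
    re.IGNORECASE,
)


def find_error_like_titles(titles):
    """Return the titles that contain error-like wording."""
    return [title for title in titles if ERROR_LIKE.search(title)]


titles = [
    "IIS Error 500.19 - Internal Server Error - The requested page cannot be accessed",
    "How to log exceptions in C#",  # made-up title for comparison
]
suspects = find_error_like_titles(titles)
for title in suspects:
    print("Possibly error-like:", title)
```

Run against the titles listed on a tag page, this prints only the ones worth a second look. Whether removing or renaming them helps is something you will have to test with Search Console yourself.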
I hope this helps someone out there. If it did, or if you had a similar problem, let me know in the comments.