What is near duplicate content?
Near duplicate content is content from one page that appears on another page with slight changes or inside a different boilerplate.
Near duplicates are among the most common SEO issues websites face.
Understand what is causing your duplicates
OnCrawl’s near duplicates treemap lets you explore clusters of pages with a high similarity ratio.
- Spot whether your duplicates are caused by an integration issue
- Learn which pieces of content are contributing to duplicates
- Check which zones of your website are more susceptible to duplication
SEOs need to deal with duplication
The bigger your website, the more likely it is that near duplicates are hurting your SEO. The main question is what level of similarity is acceptable. That’s why OnCrawl integrates a state-of-the-art linguistic approach based on the Damerau-Levenshtein distance.
- Filter your near duplicates by similarity ratio
- Detect the similarity threshold at which duplication can hurt your SEO
- Perfect your canonical management rules
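To make the idea of a similarity ratio concrete, here is a minimal Python sketch of how a Damerau-Levenshtein-style distance can be turned into a ratio and compared against a threshold. It is an illustration only, not OnCrawl's implementation: the function names (`osa_distance`, `similarity_ratio`, `near_duplicates`), the word-level tokenization, the ratio formula (1 minus distance divided by the longer length) and the default threshold are all assumptions. The distance shown is the restricted "optimal string alignment" variant of Damerau-Levenshtein, chosen because it is short to write.

```python
from itertools import combinations


def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions and swaps of two
    adjacent items. Works on strings or on lists of words.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every item of a
    for j in range(cols):
        d[0][j] = j  # insert every item of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity_ratio(a, b):
    """Assumed formula: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - osa_distance(a, b) / longest


def near_duplicates(pages, threshold=0.9):
    """Return (url_a, url_b, ratio) for page pairs at or above the threshold.

    `pages` is a hypothetical dict mapping a URL to its text content.
    """
    tokens = {url: text.lower().split() for url, text in pages.items()}
    pairs = []
    for url_a, url_b in combinations(tokens, 2):
        ratio = similarity_ratio(tokens[url_a], tokens[url_b])
        if ratio >= threshold:
            pairs.append((url_a, url_b, ratio))
    return pairs
```

In this sketch, two six-word pages that differ by a single word score about 0.83, so they would be flagged at a threshold of 0.8 but not at 0.9. Comparing every pair of pages grows quadratically with the number of pages, so a real crawler would need a faster way to shortlist candidates.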