3.9 Advanced Tools
Advanced tools focus on the finer details of SEO including internal linking, redirections and Moz analysis.
3.9.1 Automatic Linking
This section lets you set specific keywords to always link to content on your site, or even a different site altogether.
For example, you could set your site to link to https://wordpress.org/news/ any time you type ‘WordPress news’. Without this plugin, you would have to manually create these links each time you write the text in your pages and posts – this can save you a bunch of time.
Start by enabling the module.
Post types
Here you can select the post types for which you want to enable auto-linking and post types or taxonomies that can be linked to.
Insert Links
The Insert Links option enables you to select exactly which post type(s) the plugin should automatically insert links in. Every post type active on your site will be available for keyword linking here.
Link to
The Link To option lets you specify the post types & taxonomies that can be automatically linked to. For example, if you have a “Really Cool Stuff” category on your site, the plugin could automatically link to that category archive any time it finds “really cool stuff” on your site.
Custom Links
The Custom Links option enables you to really tweak things. If there are any keywords or key phrases that you want to automatically link to specific URLs, you can enter them here.
Click the Add Link button. This will pop open an ‘Add Custom Keywords’ modal. Enter the custom keyword(s) in the Keyword Group field (separated by commas) and the URL you want the keywords to link to in the Link URL field.
Click Add to add the entered custom keywords and links. Click the gear icon to the right of any custom keywords list item to reveal additional options to edit or remove the added custom keywords.
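The idea behind a keyword group can be sketched in a few lines of Python. This is only an illustration of the concept, not the plugin's implementation: the `autolink` function and its behavior are assumptions made for the example, and a real implementation would also need to skip text that is already inside HTML tags or existing links.

```python
import re


def autolink(text: str, keyword_group: str, url: str) -> str:
    """Link every keyword from a comma-separated group to `url` (hypothetical helper)."""
    keywords = [k.strip() for k in keyword_group.split(",") if k.strip()]
    if not keywords:
        return text
    # Try longer phrases first so a long phrase wins over a shorter one it contains.
    keywords.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in keywords)
    # \b keeps the match on whole words; matching here is case-insensitive.
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f'<a href="{url}">{m.group(0)}</a>', text)


linked = autolink(
    "Catch up on WordPress news every week.",
    "WordPress news, WP news",
    "https://wordpress.org/news/",
)
print(linked)
```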
Exclusions
The Exclusions setting enables you to enter any keywords that the plugin should not use for links. This can be very handy if you notice certain words or phrases are linking to places you don’t want them to or if there are certain areas of your site you don’t necessarily want to be linked to (the “Uncategorized” category for example).
Excluded keywords
Enter the keyword(s) you want to exclude from auto-linking in the input field (separated by commas).
Excluded posts/pages/media
You can also specify Posts, Pages, or Media files you want to exclude from auto linking by pressing the Add Exclusion button.
This will pop open an ‘Add Exclusion’ modal where you can select specific posts, pages or media files to be excluded from auto-linking. Click Add to add the selected posts/pages/media to the exclusions.
All the added exclusions are listed here. Click the Trash icon to the right of any added exclusion to remove it.
Min Lengths
Sets the minimum title length for the post types and taxonomies that can be linked to. Enter the desired number, and any post or taxonomy whose title is shorter than that number will be ignored.
Max Limits
Max Limits enables you to set a maximum number of links per single post, or to limit the total number of links per keyword group that will appear on your site.
Per Keyword Group example: If you want a keyword to link to the mysite.com/smartcrawl page and you set this to 10, the keyword will be linked up to 10 times in total.
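A per-keyword-group limit behaves like a shared budget that is spent as links are created. The sketch below assumes a hypothetical `link_with_budget` helper and a small budget of 2 so the effect is easy to see. It illustrates the idea only and is not how the plugin itself is written.

```python
import re


def link_with_budget(posts, keyword, url, budget):
    """Link `keyword` across posts until `budget` links have been created."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b")

    def make_link(match):
        return f'<a href="{url}">{match.group(0)}</a>'

    linked = []
    for body in posts:
        if budget > 0:
            # `count` caps the replacements in this post at the remaining budget.
            body, used = pattern.subn(make_link, body, count=budget)
            budget -= used
        linked.append(body)
    return linked


posts = ["SmartCrawl tips and more SmartCrawl tips.", "SmartCrawl again."]
result = link_with_budget(posts, "SmartCrawl", "https://mysite.com/smartcrawl", budget=2)
for body in result:
    print(body)
```

Both occurrences in the first post are linked, which uses up the budget, so the second post is left untouched.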
Optional Settings
These options enable you to set up everything down to the smallest detail. Let’s go over each one.
Allow Autolinks To Empty Taxonomies
Sometimes you want to have your autolinks point to empty taxonomies. You can use this for a number of reasons:
- Affiliate links (nofollow, noindex).
- Deep linking into your archive content (best use).
- Cross-linking to partner sites (with caution).
For example, you may have a “Cats” category but have not yet written anything about cats. Enabling this option lets the plugin link to that category now, so the links are already in place for future posts.
Prevent Linking in Heading Tags
Enable this option to prevent automatic links from being inserted inside heading tags (h1 through h6).
Process Only Single Posts And Pages
Enabling this option will ensure that autolinking does not occur in places like archives, post excerpts or search results pages.
Process RSS Feeds
Process RSS Feeds will ensure that links are automatically included in your RSS feed.
Case Sensitive Matching
Case Sensitive Matching will ensure that links are automatically created only if uppercase and lowercase spelling is an exact match.
Prevent Duplicate Links
Prevent Duplicate Links ensures that only the first occurrence of any matched text in any post will be linked.
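The combined effect of Case Sensitive Matching and Prevent Duplicate Links can be sketched as follows. The `link_first` function is a hypothetical helper written for this example; it only models the two options described above.

```python
import re


def link_first(text, keyword, url, case_sensitive=False):
    """Link only the first occurrence of `keyword`, optionally matching case exactly."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", flags)
    # count=1 stops after the first match, so later occurrences stay plain text.
    return pattern.sub(
        lambda m: f'<a href="{url}">{m.group(0)}</a>', text, count=1
    )


text = "WordPress rocks. wordpress rules."
loose = link_first(text, "wordpress", "https://wordpress.org/")
strict = link_first(text, "wordpress", "https://wordpress.org/", case_sensitive=True)
print(loose)
print(strict)
```

With case-insensitive matching the first “WordPress” is linked; with case-sensitive matching only the lowercase “wordpress” qualifies.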
Open Links in New Tab
This ensures that the target="_blank" attribute is added to all automatic links so they open in new tabs when clicked.
Nofollow Autolinks
By default, search engine crawlers follow each link on your site until the entire site is crawled. Enable this option to add the nofollow attribute to automatic links. This can stop crawlers from following dynamic URLs that lead to the same or similar content on your site, and is also useful if you do not wish to pass link juice to external URLs.
Prevent linking on no-index pages
Enable this option to prevent automatic linking on pages that are set to no-index.
Prevent linking on image captions
Enable this to prevent links from being added in image captions.
Prevent caching for autolinked content
Enable this option to prevent object cache conflicts with some page builder plugins and themes when automatic linking is enabled.
As always, once you’re done – press the SAVE SETTINGS button.
In The Post Editor
You can also manipulate the behavior of individual links in the Gutenberg post editor by adding rel attributes.
The following attributes can be applied to specific links:
- Open in new tab – Marks the link to open in a new tab.
- Sponsored link – Marks the link as a paid placement or an advertisement.
- User generated content – It is recommended to mark user-generated content with this rel attribute. This is generally content that has been posted by users, like comments or forum posts.
- Nofollow – Use this when you don’t want the linked page to be associated with your site or you’d prefer it if Google didn’t crawl the linked page.
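To see what these options mean in the resulting markup, here is a minimal sketch that builds an anchor tag from the four toggles. The `build_link` helper is hypothetical; the `sponsored`, `ugc` and `nofollow` rel values and `target="_blank"` are standard HTML, but the exact markup the editor produces may differ (for example, it may add further rel values).

```python
def build_link(url, text, new_tab=False, sponsored=False, ugc=False, nofollow=False):
    """Return an <a> tag reflecting the selected link options (illustrative only)."""
    rel_values = [
        name
        for name, enabled in (("sponsored", sponsored), ("ugc", ugc), ("nofollow", nofollow))
        if enabled
    ]
    attrs = [f'href="{url}"']
    if new_tab:
        attrs.append('target="_blank"')
    if rel_values:
        rel = " ".join(rel_values)
        attrs.append(f'rel="{rel}"')
    return "<a " + " ".join(attrs) + f">{text}</a>"


tag = build_link("https://example.com/", "Example", new_tab=True, sponsored=True, nofollow=True)
print(tag)
```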
Filters
Increase Autolink Limit
By default, the maximum number of automatic links that the plugin will create in all post_types combined is limited to 2000.
You can override that default and set a custom limit by defining this constant in your child theme’s functions.php file or a mu-plugin, and adjusting the value to what you need:
define( 'SMARTCRAWL_AUTOLINKS_GET_POSTS_LIMIT', 3000 );
3.9.2 URL Redirection
Redirection enables you to forward one URL to another. It’s a handy way of sending both users and search engines to a different URL and allows you to preserve your search engine rankings for a particular page.
Use the dynamic Search field to locate specific redirects that you’ve already set up. Just start typing and the search will display all matches as you type.
To Edit or Remove an existing redirect, click the gear icon in its row and select the appropriate option.
Redirection is also a useful way to preserve the “link juice” of out-of-date content by redirecting old pages or posts to new ones with new information.
If you trash a published post or page, a notice will appear at the top of the All Posts/Pages screen to remind you to create a redirect for that old URL. Copy the slug shown in the notice, then click the Add redirection link to be taken to the Add a redirect screen. Note that a future update of the plugin will ensure the slug gets automatically added for you so you won’t need to copy it.
Add a redirect
To add a new redirect, click the Add Redirect button at the top of the screen to open the options modal.
Redirect Tab
- Redirect From – Enter the absolute or relative URL that you want to redirect from.
- Redirect To – Enter the destination you want to redirect to. This can be a manually entered absolute or relative URL, or you can start typing the title of the page, post or custom post type you want, and select it from the results found.
- Redirect Type – Select the type of redirect you want to apply if different from your Default Redirection Type.
- Location-based Rules – This option enables you to selectively redirect users based on their geographical location.
Location-based Rules
Enable this feature if you wish to add rules to redirect users based on their geographical locations. Note that this feature requires you to sign up for a MaxMind account. See Location-based Rules in the Settings chapter below for instructions on getting your MaxMind key.
Toggle on the option to Disable default Redirect To if you wish to redirect users based on their location only.
Then click the + Add Rule button to add a new location rule.
- Rule – Select whether you wish to redirect users who are From or are Not From the countries you specify.
- Countries – Select the country or countries this rule should apply to.
- Redirect To – Enter the URL where users who match this rule should be redirected to.
You can add more than one location rule for a redirect. Rules are evaluated in top-to-bottom order, so a rule higher in the list takes precedence over the rules below it.
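The top-to-bottom precedence described above amounts to "first matching rule wins". The sketch below models that with a hypothetical `LocationRule` class and `resolve_redirect` function; the names, the use of two-letter country codes and the example URLs are assumptions for illustration, not the plugin's internals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocationRule:
    mode: str             # "from" or "not_from"
    countries: frozenset  # country codes, e.g. frozenset({"FR", "BE"})
    redirect_to: str


def resolve_redirect(country: str, rules: List[LocationRule],
                     default_to: Optional[str] = None) -> Optional[str]:
    """Return the target of the first rule that matches `country`."""
    for rule in rules:  # list order is precedence order
        inside = country in rule.countries
        if (rule.mode == "from" and inside) or (rule.mode == "not_from" and not inside):
            return rule.redirect_to
    # No rule matched: fall back to the default Redirect To (None if disabled).
    return default_to


rules = [
    LocationRule("from", frozenset({"FR"}), "/fr/"),
    LocationRule("not_from", frozenset({"US"}), "/intl/"),
]
print(resolve_redirect("FR", rules, "/home/"))  # both rules match; the first one wins
print(resolve_redirect("DE", rules, "/home/"))
print(resolve_redirect("US", rules, "/home/"))
```

A visitor from France matches both rules, but only the first one is applied. Passing `default_to=None` models the Disable default Redirect To toggle, where visitors who match no rule are not redirected.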
If you add one or more conflicting location rules, a notice will appear to remind you so you can adjust.
To delete an existing rule, click on the Trash icon next to it.
Advanced Tab
Here you can add a label for the redirect and also choose how the redirect URLs should be treated.
- Label – Add an optional label to your redirect if you have long URLs or several similar ones so you can easily tell them apart.
- Regular Expressions – Select if the redirect from/to URLs should be treated as plain text or regular expressions. If your URL contains a regex pattern, be sure to select the Regex option. See About Regex Redirects below for more.
About Regex Redirects
SmartCrawl supports regex patterns in your redirects. Regex is short for regular expression, which is a handy shorthand that can be used to match numerous URLs using a single string of text.
IMPORTANT
Regex redirects must use absolute URLs, like https://yoursite.tld/cats/(.+). They will not work with relative URLs, like /cats/(.+)
To illustrate how powerful regex can be, let’s say you wanted to catch all URLs containing the distinct word cats anywhere in the URL, and redirect all of them to a /cats-new/ post instead. For example:
yoursite.tld/cats
yoursite.tld/old-cats
yoursite.tld/cats-here
yoursite.tld/old-cats/here
yoursite.tld/even-more/old-cats
yoursite.tld/still/more/cats/here/too
You could create a separate redirect for each one, but if you have dozens or hundreds of such URLs, that can get quite tedious. The easy alternative is to “catch” all of them with a regex pattern like so:
yoursite\.tld\/(?!cats-new\/|cats-new$).*\bcats\b.*
This pattern would match all of the above URLs. But it would not match any of the following URLs, as the letter combination c-a-t-s in each one is actually part of a different word.
yoursite.tld/ducats
yoursite.tld/catsup
yoursite.tld/scats
It would also not match the URL you’re redirecting to, which also contains the word cats, as that would cause an endless redirect loop:
yoursite.tld/cats-new
Here’s a quick breakdown of how the above regex works.
The first part of the expression checks if the URL is exactly where you want to redirect all your stuff (in our example, that’s /cats-new/) and if it is, do not redirect. That little bit of magic happens with a negative lookahead – ?! – to exclude that particular pattern.
(?!cats-new\/|cats-new$)
The second part of the expression says to match any combination of characters (the .* tokens) before and/or after the explicit word cats (indicated by enclosing the word in \b tokens).
.*\bcats\b.*
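If you want to check the pattern yourself, the short Python script below runs it against every example URL listed above. The plugin runs on PHP, so this is only a stand-in test harness; the tokens used here (negative lookahead, `\b`, `.*`) are interpreted the same way by Python's `re` module.

```python
import re

# Pattern copied from the example above.
PATTERN = re.compile(r"yoursite\.tld\/(?!cats-new\/|cats-new$).*\bcats\b.*")

redirected = [
    "yoursite.tld/cats",
    "yoursite.tld/old-cats",
    "yoursite.tld/cats-here",
    "yoursite.tld/old-cats/here",
    "yoursite.tld/even-more/old-cats",
    "yoursite.tld/still/more/cats/here/too",
]
left_alone = [
    "yoursite.tld/ducats",
    "yoursite.tld/catsup",
    "yoursite.tld/scats",
    "yoursite.tld/cats-new",  # the redirect target itself
]

for url in redirected + left_alone:
    verdict = "redirect" if PATTERN.search(url) else "leave alone"
    print(f"{url:40} {verdict}")
```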
Note that regex is also supported in the target or destination URL that you’re redirecting to.
For example, let’s say you just changed a post category on your site from cats to felines, but have a ton of old cats links that your users have shared on social media.
You could create a redirect to catch all those inbound links, like so:
https://yoursite.tld/cats/(.+)
Here, anything after cats/ in the URL is identified as a capture group by enclosing it in parentheses. The (.+) will match one or more of any character, so it should catch all old links to posts in that old category.
Then redirect them to the corresponding new category URLs, like so:
https://yoursite.tld/felines/$1
The $1 token will be replaced with whatever was matched by the capture group in the old URL.
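The same capture-and-substitute behavior can be tried out in Python. Note that Python writes the back-reference as `\1` where the redirect field uses `$1`. The `rewrite` function and the sample post slug are made up for this illustration.

```python
import re

SOURCE = r"https://yoursite.tld/cats/(.+)"
# \1 is Python's spelling of the $1 token used in the redirect field.
TARGET = r"https://yoursite.tld/felines/\1"


def rewrite(url: str) -> str:
    """Return the redirect target for `url`, or `url` unchanged if it does not match."""
    return re.sub(SOURCE, TARGET, url)


print(rewrite("https://yoursite.tld/cats/my-tabby-post/"))
print(rewrite("https://yoursite.tld/dogs/rex/"))
```

The first URL is rewritten to the felines category with the rest of the path intact, while the second does not match the pattern and is returned as-is.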
If you’re new to regex and want to get a feel for the basics, see this excellent article: Regular Expressions for Non-Programmers.
We suggest using a handy tool like this one to help you build & validate your regex patterns: https://regex101.com/
Bulk Options
You can also bulk delete or bulk update multiple redirects that you’ve already set up. To bulk update multiple redirects, select them from your list and click the Update button.
In the modal window that pops open, enable any or all options corresponding to the values you wish to update.
- Redirect URL – The new URL where all selected items will be redirected.
- Redirect Type – The redirect type to apply to all selected items.
- Location-based Rules – The location-based rules to apply to all selected items.
The available bulk options function exactly as when adding a new redirect. See Add a Redirect above for details.
Note however that you cannot apply regex rules when bulk updating existing redirects. To apply regex rules, you would need to edit each redirect individually.
Import & Export
This feature enables you to export a JSON file with all your redirects, including location rules, and import it later on other sites where the same relative URLs are used.
Note that CSV files of redirects without location rules that were created in prior versions of the plugin can still be imported. But any new imports & exports with location rules must be in JSON format.
To export your redirects, click the Export button at the top of the screen to download a file of all redirects from your site.
To import redirects into a site, click the Import button. In the modal window that pops open, click the Upload button and navigate to the file on your computer that you wish to import, then click Import.
Pagination
If you have more than 10 redirects set up, a pagination option will appear where you can select to display 10, 20, 50 or 100 redirects per page.
Delete All
If you wish to delete all your redirects in a single click, click the Delete All button at top-right. A modal window will appear prompting you to confirm the action.
Settings
Redirect Attachments
Did you know that each time you upload a file to the WordPress Media Library and choose Link to: Attachment Page, WordPress creates a separate media attachment page for every single file? This page contains nothing except the media content and has its own generated URL.
In most cases, this page isn’t particularly beneficial, which is why you might want to redirect WordPress attachments to their original files.
Separate media pages may work for photographers and graphic designers, as they help to create galleries but, for an average WordPress user, it makes sense to redirect WordPress attachments to their original files (improving SEO in the process).
Let’s say you create a post and add three images as Link to: Attachment Page to it. WordPress then automatically creates four URLs, three for the images, and one for the original post. This can hurt SEO in more ways than one:
- Google may start bringing more traffic to the attachment pages instead of the original post to which they belong.
- The standalone attachment appears out of context, and visitors landing on the image or attachment page are likely to leave right away.
- It’s possible that Google may index all the image files and treat them as duplicate content.
Finally, there’s also a niche situation where access to content in pages and posts is restricted by a password. It may happen that someone shares your images on social media. By clicking on the image URLs, an unauthorized visitor may be able to access the media content within these posts or pages despite not knowing the password.
You can help your readers skip these attachment pages by redirecting them to the original attachment files by enabling the options here.
Keep in mind that this redirect option will only work if the media item was uploaded to the original post in the first place.
Redirect Image Attachments
You can also choose to redirect only the image attachments to their original files and other attachments to their respective parent posts by enabling the Redirect Image Attachments Only checkbox.
Default Redirection Type
Here you can select the redirection type that you would like to be used as the default. There are various types of redirection, each serving a unique purpose.
301 Permanent Redirect
This is used when a URL has permanently moved to a new location. Search engines and browsers update their records, and users are automatically redirected to the new URL. This is a permanent redirection, and it’s important for SEO as it passes on the SEO value from the old URL to the new one.
302 Temporary Redirect
This redirect is used when a URL has been temporarily moved to a different location. It’s often used for short-term changes. However, in terms of SEO, search engines may continue to index the original URL, and the SEO value might not pass on to the new URL.
307 Temporary Redirect
This redirect is used for temporary moves, preserving the original HTTP method. It’s similar to a 302 redirect, but it guarantees that the request method is not changed when the client follows the redirect. You can use it to ensure that the original method (e.g., POST) is retained when the client is redirected.
410 Content Deleted
This status code indicates that the URL is no longer available and there is no replacement. When search engines encounter a 410, they may de-index the page more quickly than with other status codes, because it signals that the content is intentionally gone.
451 Content Unavailable for Legal Reasons
This status code indicates that a web resource is inaccessible due to legal obligations, such as government-mandated censorship or copyright issues. It’s used to inform users and clients about content restrictions based on legal grounds.
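The labels used above are the plugin's own. For reference, the standard reason phrases for the same status codes can be listed with Python's built-in `http` module, as in this small sketch:

```python
from http import HTTPStatus

# Standard reason phrases for the codes offered as redirection types.
phrases = {code: HTTPStatus(code).phrase for code in (301, 302, 307, 410, 451)}

for code, phrase in phrases.items():
    print(code, phrase)
```

For example, 410 is listed as "Gone" in the HTTP standard, which corresponds to the Content Deleted option here.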
Location-based Rules
Location-based redirection enables you to set up rules to redirect URLs based on the site visitor’s geographical location.
Location-based redirection requires you to sign up for a MaxMind account to get the license key, which is free, although paid services are available.
To sign up, click the Create a free account link in the Location-based rules module.
Complete the MaxMind GeoLite Sign Up form, then click Continue.
MaxMind will send an email containing verification information. Follow the directions in the email to verify and activate the account.
After a successful verification, log in to your MaxMind account.
In the menu on the left, click Manage License Key. Then, click on Generate New License Key.
In the modal, enter a name for your license key and click Confirm.
The License Key will be generated and displayed. Copy and paste the generated license key in the MaxMind License Key field and click Download.
A notice will pop up to remind you that it takes up to 5 minutes for MaxMind to activate your new key.
Once you have clicked the Download button and the license key has been activated, you can start adding the redirect rules, as seen in Location-based Rules above.
Deactivate
If you no longer need the URL Redirection feature, you can click the button here to deactivate it. Deactivating the feature will disable all existing redirects, but will not delete them.
3.9.3 WooCommerce SEO
WooCommerce adds basic schema data by default when activated on a site. When SmartCrawl detects WooCommerce on your site, you can enable this feature to enhance that basic SEO configuration.
However, if you want full control over all specifics of your WooCommerce Product schema, you can create a product type in the Types Builder and fine-tune everything there instead.
Click Activate to begin.
Once activated, the following WooCommerce configuration options are available.
Improve Woo Schema
- Brand – This option enables you to select a default taxonomy to use as the Brand schema for all products. If you specify a different brand taxonomy for any specific WooCommerce Product type in the Types Builder, that taxonomy will override the default set here.
- Global Identifier – This option enables you to specify a global GTIN or MPN to use throughout all products. When enabled, the selected identifier will be added as an input field under the Inventory tab in the product editor. If you add the same identifier as a Property to any specific WooCommerce Product schema in the Types Builder and select a different value, that will override what you set here.
- Enable Shop Schema – When enabled, this option will automatically add the CollectionPage markup to your WooCommerce shop and product archive pages to help search engines better understand that there are several items on the pages.
Improve Woo Meta
- Enable Product Open Graph – Enable this to add product price and currency meta to Open Graph if that is also enabled in Title & Meta > Post Types > Products.
- Remove Generator Tag – Enable this option to remove the WooCommerce generator tag from your site: <meta name="generator" content="WooCommerce x.x.x" />
Restrict Search Engines
- No-Index Hidden Products – If you have set Catalog Visibility to hidden for any products in WooCommerce, enable this option to prevent search engines from indexing those product pages.
- Disallow Crawling of Cart, Checkout & My Account Pages – Enable this to automatically add the following entries to your Robots.txt file when using the Robots.txt Editor feature:
- Disallow: /*add-to-cart=*
- Disallow: /cart/
- Disallow: /checkout/
- Disallow: /my-account/
3.9.4 Moz
Moz is the industry leader in SEO reports, and we make it easy to integrate with their API. Note that configuring this is entirely optional.
To take advantage of the Moz reporting tools, you need a Moz API Access ID and Secret Key.
To get your Moz API credentials, create a new Moz Pro account or sign in to your existing account.
After creating a Moz Pro account, navigate to https://moz.com/products/api/pricing to create a Moz API account.
Opt for a free or paid plan, enter the required information, and complete the Moz API account creation. After a successful Moz API account creation, the subscription summary will be displayed.
Now, to create or access your Moz API credentials, navigate to the API dashboard and click on the Add Token button.
Enter an identifier for the token and click Create to generate credentials.
To view the generated API credentials, click on the Show Legacy Credentials link.
Copy the Access ID and Secret Key.
Paste the generated API credentials in the Moz integration modal and click Connect.
After a successful connection, it’ll take a few minutes to see metrics specific to your site (in a multisite install, metrics specific to each site in your network appear in the dashboard of each site). You can also see individual stats per post in the post editor under the SEOmoz URL Metrics module.
You’ll find a wealth of information about good SEO practices, and details about your site metrics, by visiting the Moz Learn Section.
Moz API Subscription & Usage Quota
Note that the Moz API account is separate from the standard Moz account. If you already have a Moz account, make sure to create a Moz API account to create/access the API credentials.
Additionally, ensure you have an active API subscription and that your account has not exceeded its usage quota. You can check your current usage in the API dashboard and your API subscription on the subscriptions page.
Deactivate
If you no longer need the Moz feature, you can click the button here to deactivate it. Note that deactivating the feature will reset your Moz credentials.
3.9.5 Robots.txt Editor
This tool enables you to directly edit the robots.txt file for your site without having to go into the file system via FTP or cPanel.
Robots.txt is used to tell web crawlers which parts of your site they should or should not crawl. For example, you could include directives in your robots.txt file to keep Google and other search engines away from certain files on your website (images, PDFs, etc.) so they are less likely to appear in search results.
Start by enabling the module.
IMPORTANT
This tool cannot be used to edit an existing physical robots.txt file. If such a file already exists, the Activate button will be replaced by a message alerting you that you will need to remove that file before proceeding.
Output
The top section here will give you a link to the location of the virtual robots.txt file created by the plugin and show you the current contents of the file that search engines will see.
Include Sitemap
This setting enables you to automatically include the URL to your site’s sitemap.xml file in your robots.txt file. It’s a good idea to ensure this option is enabled, and the sitemap.xml URL is entered here, so search engines know where to find that file as they use it to crawl your site.
Note that if you have enabled the Sitemap module in SmartCrawl, this URL will be automatically filled in for you and cannot be changed.
Customize
This is where you can add any directives you need in your robots.txt file to instruct web crawlers on what they should do on your site.
By default, this section only contains the following, which allows all user agents (search engines) to access and index all content:
User-agent: *
Disallow:
- User-agent: * means all ( * ) search engines.
- Disallow: with nothing after it means nothing is disallowed, so access to all content is allowed.
You can add any additional directives you need in this section but can leave it as-is if you don’t need or want to restrict web crawler access.
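To confirm that the default directives really do allow everything, you can feed them to the robots.txt parser that ships with Python. This is just a quick check from the crawler's point of view; `ExampleBot` and the URL are placeholder names.

```python
from urllib.robotparser import RobotFileParser

default_rules = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(default_rules.splitlines())

# An empty Disallow means no path is blocked for any user agent.
allowed = parser.can_fetch("ExampleBot", "https://yoursite.tld/any/page/")
print(allowed)
```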
Customization Examples
If you want to prevent search engines from indexing WooCommerce cart & checkout pages on your site, you could add rules like these to your robots.txt file:
Disallow: /*add-to-cart=*
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
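The `*` in rules like these is a wildcard that stands for any sequence of characters, and a rule otherwise matches any path that starts with it. The sketch below shows one way to model that matching, following the wildcard semantics documented by the major search engines (`*` for any characters, an optional trailing `$` for end of URL). The `rule_matches` helper is hypothetical, and not every crawler supports wildcards.

```python
import re


def rule_matches(rule_path: str, url_path: str) -> bool:
    """Check whether a robots.txt path rule applies to a URL path."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    regex = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the prefix behaviour.
    return re.match(regex, url_path) is not None


print(rule_matches("/*add-to-cart=*", "/shop/?add-to-cart=42"))
print(rule_matches("/cart/", "/cart/"))
print(rule_matches("/cart/", "/shop/"))
```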
If you’re having trouble with bots scanning your WooCommerce pages, adding products to the cart and slowing down your site, you could add rules like these to your robots.txt file to block them:
User-agent: *
Disallow: /wp-admin/
Disallow: /?s=
Disallow: /search
Disallow: /wp-json
Disallow: /cart
Disallow: /wishlist
Disallow: /checkout
Disallow: /my-account
Disallow: /*?orderby*
Disallow: /*?filter
Disallow: /*add-to-cart=*
Disallow: /*?add_to_wishlist=*
Allow: /wp-admin/admin-ajax.php
Crawl-delay: 10
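As a rough check of how a well-behaved bot would read rules like these, the sketch below parses a subset of them with Python's built-in robots.txt parser. Only the plain path rules are included, because that parser does not interpret `*` wildcards inside paths. `ExampleBot` is a placeholder name, and keep in mind that not every crawler honors the Crawl-delay directive.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /my-account
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

delay = parser.crawl_delay("ExampleBot")
cart_allowed = parser.can_fetch("ExampleBot", "https://yoursite.tld/cart")
shop_allowed = parser.can_fetch("ExampleBot", "https://yoursite.tld/shop")
print(delay, cart_allowed, shop_allowed)
```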
For more information on robots.txt directives, please visit this handy article at moz.com https://moz.com/learn/seo/robotstxt
Google and Wikipedia also have some good articles to help out if needed:
https://support.google.com/webmasters/answer/6062608?hl=en
https://support.google.com/webmasters/answer/6062596?hl=en
https://en.wikipedia.org/wiki/Robots_exclusion_standard#About_the_standard