Automating Sitemap Submissions to Google & Bing using Sitecore PowerShell Extensions

Keeping search engines apprised of content updates (additions, modifications, removals) is crucial for businesses that want to grow user acquisition and improve user experience. Because Google’s scheduled recrawl can take weeks for some sites, it is worth notifying Google of essential content updates early so that the site gains the resulting traffic sooner.

NOTE: If the site holds short-lived content such as live events or job postings, you can also benefit from the Google Indexing API. This blog post describes how to integrate the Google Indexing API with Sitecore Publish.

With Sitecore PowerShell Extensions, sitemaps can be submitted to the search engines’ ping services quickly and flexibly.

STEPS TO CREATE POWERSHELL MODULE
A PowerShell module (e.g. ‘SEO’) can be created under /sitecore/system/Modules/PowerShell/Script Library using the ‘Module Wizard’ insert option.

Select the ‘Content Editor’, ‘Tasks’ and ‘Shared Functions’ integration points while creating the module.

Remove the items under the Ribbon item, then build the following tree structure under the SEO module item, using the templates/insert options indicated in parentheses:

  • Content Editor
    • Ribbon
      • SEO (PowerShell Script Library)
        • Sitemap (PowerShell Script Library)
          • Submit (PowerShell Script)
  • Functions
    • Invoke-PingService (PowerShell Script)
    • Submit-Sitemap (PowerShell Script)
  • Tasks
    • Submit Sitemap (PowerShell Script)

 

Copy the corresponding PowerShell scripts into the ‘Script body’ field of the ‘PowerShell Script’ items created above.

Update the live site’s sitemap URL(s) within the Submit-Sitemap function.
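As a rough, language-neutral illustration of what a Submit-Sitemap / Invoke-PingService pair needs to do, the Python sketch below builds one ping URL per search engine for each sitemap. It is not the module's PowerShell code. The endpoint formats and the example.com sitemap URL are assumptions made for illustration; search engines have changed or retired ping endpoints over time, so check each engine's current documentation before relying on them.

```python
from urllib.parse import quote

# Assumed ping endpoint formats; verify that each engine still accepts them.
PING_ENDPOINTS = {
    "Google": "https://www.google.com/ping?sitemap={sitemap}",
    "Bing": "https://www.bing.com/ping?sitemap={sitemap}",
}

# Hypothetical sitemap URL(s) for the live site; replace with the real ones.
SITEMAP_URLS = ["https://www.example.com/sitemap.xml"]


def build_ping_requests(sitemap_urls, endpoints=PING_ENDPOINTS):
    """Return (engine, ping_url) pairs, one per engine for each sitemap."""
    requests = []
    for sitemap_url in sitemap_urls:
        # The sitemap URL travels as a query-string value, so encode it fully.
        encoded = quote(sitemap_url, safe="")
        for engine, template in endpoints.items():
            requests.append((engine, template.format(sitemap=encoded)))
    return requests


if __name__ == "__main__":
    for engine, ping_url in build_ping_requests(SITEMAP_URLS):
        print(engine, ping_url)
```

Each resulting URL would then be requested with a plain HTTP GET; the sketch stops short of sending anything so it can be run safely.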
Navigate to ‘PowerShell ISE’ from the Sitecore Launchpad. Select the ‘Settings’ tab and choose ‘Sync Library with Content Editor Ribbon’ from ‘Rebuild All’.

The ‘Submit’ option should now be available within the ‘Sitemap’ chunk of the new ‘SEO’ tab. Content authors can use it to make on-demand sitemap submissions to the Google, Bing, Yahoo, and DuckDuckGo search engines.

STEPS TO CREATE POWERSHELL SCHEDULED TASK
A PowerShell Scripted Task can be created under /sitecore/system/Tasks/Schedules from the insert options to automate submissions at a scheduled frequency.

The schedule frequency can range from a few hours to a few days, depending on how often content is updated and how frequently Google crawls the site (which can be checked in Google Search Console). Ideally, the Googlebot crawl rate should not affect site performance. If needed, the crawl rate can be optimized as per this documentation.

Bing’s index covers the Yahoo and DuckDuckGo search engines, so they do not require dedicated sitemap submissions.

Ensure that the lastmod field for each URL is up to date in the sitemap file, as Google uses this field to determine whether a URL has been modified and requires crawling. It may take a few minutes to a few hours for the search engine bots to crawl the site once the request is submitted. Once the search engine starts crawling the sitemap, the ‘Last read’ value in Google Search Console is updated with the current date.
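To make the lastmod requirement concrete, here is a minimal Python sketch that writes a sitemap in which every url entry carries a lastmod timestamp in W3C datetime format. The function name and the sample page are hypothetical; in a Sitecore solution the timestamp would typically come from the item's last-updated date, which is an assumption about how the sitemap is generated.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from (url, last_modified) pairs.

    last_modified must be a timezone-aware datetime.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # Normalize to UTC and emit e.g. 2024-01-15T09:30:00+00:00.
        stamp = last_modified.astimezone(timezone.utc).isoformat(timespec="seconds")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = stamp
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page and modification time.
    sample = [("https://www.example.com/jobs/1",
               datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc))]
    print(build_sitemap(sample))
```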

Happy Crawling!

Request Google to crawl URLs on Sitecore Publish using Google Indexing API

Most sites hold some short-lived content such as events and job postings. Reaching the intended audience for short-lived content in time is challenging, and removing expired content from search engines is just as vital for user engagement. Both can be addressed by bridging Google's indexing mechanism with Sitecore's publish mechanism using the Google Indexing API, which helps businesses reach the right users at the right time.

IMPORTANT: The Google Indexing API currently supports automated indexing only for short-lived pages such as job postings or live events.

STEPS TO CONFIGURE INDEXING API

Create a Google API project using this Setup Tool
Navigate to the API Dashboard of the newly created project and select ‘ENABLE APIS AND SERVICES’.

Search for ‘Indexing API’ and enable it for the project.

Navigate to the Credentials tab and create credentials for the project.

Navigate to the Credentials tab and select ‘Manage Service Accounts’.

Select the ‘CREATE SERVICE ACCOUNT’ button to create a new service account, which will be used for sending indexing requests to Google.

Select ‘Actions’ -> ‘Manage Keys’ to create a new key in JSON format.

Store the downloaded JSON file safely; it is required to send indexing requests to Google.

Navigate to Google Search Console and open the respective property. Select ‘Settings’ -> ‘ADD USER’ and add the service account created earlier.
Then select the ‘Actions’ button of any existing owner account and select ‘Manage Property Owners’ to add the service account as an owner of the Google Search Console property (only verified owner accounts can initiate indexing requests to Google).

STEPS TO INTEGRATE INDEXING API

This integration requires the Google.Apis.Indexing.v3 NuGet package, which needs to be added to the project. Depending on the Sitecore version, you may also need to update the ‘oldVersion’ attribute of the ‘bindingRedirect’ configured for ‘Newtonsoft.Json’ in web.config to 0.0.0.0-12.0.0.0, as the Google API library looks for Newtonsoft.Json 12.0.0.0.

Pages that were created or updated during Publish or Workflow Approval operations are captured by a custom processor added to the publish pipeline and sent to Google.

An event handler for the item:deleting event is added to capture the links of deleted pages and send them to Google.

The above processor and event handler depend on IndexingAPIHelper.cs and ItemExtensions.cs, which need to be added to the solution.

Copy the JSON file downloaded during the setup process to the website root folder and update the file name in the GetGoogleIndexingAPIClientService method of the IndexingAPIHelper class accordingly.
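For orientation, the notifications sent by the publish processor and the delete handler boil down to small JSON bodies posted to the Indexing API's publish endpoint. The Python sketch below only builds those bodies; it is not the C# helper from the solution. The function names and example.com URLs are hypothetical, and authentication with the service account key is deliberately left out.

```python
import json

# Publish endpoint of the Indexing API (v3). Real requests must carry an
# OAuth 2.0 token obtained with the service account's JSON key.
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def build_notification(page_url, deleted=False):
    """Return the request body for a single URL notification."""
    return {"url": page_url, "type": "URL_DELETED" if deleted else "URL_UPDATED"}


def notifications_for_publish(published_urls, deleted_urls):
    """Combine created/updated and deleted pages into one list of bodies."""
    bodies = [build_notification(url) for url in published_urls]
    bodies += [build_notification(url, deleted=True) for url in deleted_urls]
    return bodies


if __name__ == "__main__":
    # Hypothetical pages touched by one publish operation.
    bodies = notifications_for_publish(
        ["https://www.example.com/jobs/123"],
        ["https://www.example.com/events/expired"],
    )
    for body in bodies:
        print("POST", INDEXING_ENDPOINT, json.dumps(body))
```

Keeping body construction separate from sending makes it easy to log or inspect what would be submitted before enabling the integration on a live site.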

The configuration is now complete! Indexing requests for added, updated, and deleted content will be sent to Google upon publishing. Ensure that the respective pages follow Google's structured data standards (JobPosting, BroadcastEvent).

Indexing API requests can be monitored from the Indexing API Metrics tab.

Please note that the default quota for indexing requests is 200 per day; you may want to request a higher quota by following the steps described here. Quota usage can be viewed from the Indexing API Quota tab.
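Since a large publish can touch more pages than the quota allows, it may help to cap what is sent and defer the rest. The Python sketch below shows one simple way to do that; the function name is hypothetical, the default of 200 mirrors the quota mentioned above, and how the number of requests already used is tracked is left as an assumption.

```python
def split_by_quota(urls, used_today, daily_quota=200):
    """Split URLs into (send_now, deferred) based on the remaining quota."""
    # Never let the remaining allowance go negative.
    remaining = max(daily_quota - used_today, 0)
    return urls[:remaining], urls[remaining:]


if __name__ == "__main__":
    # Hypothetical batch of five pages with 198 requests already used today.
    pages = [f"https://www.example.com/jobs/{n}" for n in range(1, 6)]
    send_now, deferred = split_by_quota(pages, used_today=198)
    print("send now:", send_now)
    print("deferred:", deferred)
```

Deferred URLs could be queued and retried in the next quota window, ideally prioritizing deletions so that expired pages leave the index first.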

Source code is available on GitHub. Please share your feedback below.

Happy Indexing!