Request Google to crawl URLs on Sitecore Publish using Google Indexing API

Most sites hold short-lived content such as events and job postings. Reaching the intended audience while that content is still relevant is challenging, and removing expired content from search engines is just as important for user engagement. Both can be addressed by connecting the Sitecore publish mechanism to Google’s indexing mechanism through the Google Indexing API, so that the content reaches the right users at the right time.

IMPORTANT: Google Indexing API currently supports automated indexing only for short-lived pages such as job postings and live events.

STEPS TO CONFIGURE INDEXING API

Create a Google API project using this Setup Tool

Navigate to the API Dashboard of the newly created project and select ‘ENABLE APIS AND SERVICES’

Search for ‘Indexing API’ and enable it for the project

Navigate to Credentials Tab, and create credentials for the project

Navigate to Credentials Tab, and select ‘Manage Service Accounts’

Select the ‘CREATE SERVICE ACCOUNT’ button to create a new Service Account, which will be used for sending indexing requests to Google.

Select ‘Actions’ -> ‘Manage Keys’ to create a new JSON API key.

Store the downloaded JSON file safely; it is required to send indexing requests to Google.

Navigate to Google Search Console and then to the respective property. Select ‘Settings’ -> ‘ADD USER’ and add the Service Account (created earlier).
Select the ‘Actions’ button of any existing Owner account and select ‘Manage Property Owners’ to add the Service Account as an Owner of the Google Search Console property (only verified owner accounts can initiate indexing requests to Google).

STEPS TO INTEGRATE INDEXING API

This integration requires the Google.Apis.Indexing.v3 NuGet package, which needs to be added to the project. Depending on the Sitecore version, you may also need to update the ‘oldVersion’ attribute of the ‘bindingRedirect’ configured for ‘Newtonsoft.Json’ in web.config to 0.0.0.0-12.0.0.0, as the Google API library looks for Newtonsoft.Json 12.0.0.0.

Pages that were created or updated during Publish or Workflow Approval operations are captured by a custom processor added to the Publish pipeline and then sent to Google.

An event handler for the item:deleting event is added to capture the links of deleted pages, which are then sent to Google as well.

The above processor and event handler depend on IndexingAPIHelper.cs and ItemExtensions.cs, which need to be added to the solution.
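To make the data flow concrete, here is a minimal sketch (in Python rather than the C# of the actual solution) of the notification bodies that end up being posted to the Indexing API. The endpoint and the URL_UPDATED / URL_DELETED types come from Google's Indexing API; the function names and the example.com URLs are hypothetical and only for illustration.

```python
import json

# Publish endpoint of the Google Indexing API (v3).
PUBLISH_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def build_notification(url, deleted=False):
    """Return the JSON body for a single publish notification."""
    # URL_UPDATED covers both newly created and updated pages;
    # URL_DELETED asks Google to drop the page from its index.
    return {"url": url, "type": "URL_DELETED" if deleted else "URL_UPDATED"}


def build_notifications(published_urls, deleted_urls):
    """Combine URLs captured at publish time with URLs captured on delete."""
    bodies = [build_notification(u) for u in published_urls]
    bodies += [build_notification(u, deleted=True) for u in deleted_urls]
    return bodies


if __name__ == "__main__":
    # Placeholder URLs (hypothetical, not from the article).
    for body in build_notifications(
        ["https://example.com/jobs/editor"], ["https://example.com/events/old"]
    ):
        print("POST", PUBLISH_ENDPOINT, json.dumps(body))
```

The sketch only builds the payloads; authentication with the service account key and the actual HTTP calls are handled by the Google client library in the real integration.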

Copy the JSON file downloaded during the setup process to the website root folder and update the file name in the GetGoogleIndexingAPIClientService method of the IndexingAPIHelper class accordingly.

The configuration is now complete! Indexing requests for added/updated/deleted content will be sent to Google upon publishing. Ensure that the respective pages follow Google’s structured data standards (JobPosting, BroadcastEvent).

Indexing API requests can be monitored from the Indexing API Metrics tab.

Please note that the default quota for indexing requests is 200 per day; you may want to request a higher quota by following the steps described here. Quota usage can be viewed from the Indexing API Quota tab.
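Because of this quota, a large publish may need to be trimmed before sending. The following is a small Python sketch of one way to do that, assuming the caller tracks how many requests were already sent today; split_by_quota and its parameters are hypothetical names, not part of the module.

```python
DEFAULT_DAILY_QUOTA = 200  # default publish quota mentioned above


def split_by_quota(urls, already_sent_today, daily_quota=DEFAULT_DAILY_QUOTA):
    """Return (send_now, deferred) so that send_now fits the remaining quota."""
    remaining = max(daily_quota - already_sent_today, 0)
    unique = list(dict.fromkeys(urls))  # drop duplicate URLs, keep order
    return unique[:remaining], unique[remaining:]
```

Deferred URLs could be queued and sent on the next day, or the quota increased if this happens regularly.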

Source code is available on GitHub.

Happy Indexing!

In-Sitecore Alerts & Push Notifications: Effectively communicating Maintenance Activities to Authors & Marketers

Building a strong DevContentOps practice within the organization helps keep the productivity of all teams at its maximum. This module eliminates friction and enables seamless collaboration between Authors, Marketers, Developers, and Operations.

The module can be downloaded from the below links,
Sitecore Maintenance Notification-v1.0 – Sitecore Package
sitecore-maintenance-notification:1.0.0.0-1903 – Docker Image

Integrating the above image into the CM/MSSQL containers requires a few changes to the .env, docker-compose/override, and Dockerfiles of the CM & MSSQL containers, as indicated in this screenshot (a build is required for the changes to take effect). Alternatively, Sitecore packages can also be installed in Docker as described here.

Key Benefits:
• Optimized delivery of maintenance alerts through in-Sitecore alerts and push notifications (works even if the browser is not open) with appropriate auto-expiration.
• Introduces opt-in/opt-out flexibility for maintenance notifications from the Sitecore Control Panel.
• Displays a maintenance page during maintenance to avoid confusion.
• Displays maintenance alerts in the user’s local timezone.
• Enables smooth handling of any schedule changes and cancellations.
• Sends reminders before the outage and a completion message immediately after warm-up to keep the content freeze at a bare minimum.
• Permits modifying scheduled alert messages, reminder & completion messages, reminder duration, etc. quickly from Sitecore.

NOTE:
This module is built on top of Sitecore PowerShell Extensions (SPE); please ensure SPE (v5.0+) is installed.

STEPS TO CONFIGURE THE MODULE:
After installing the module, navigate to ‘PowerShell ISE’ from the Sitecore Launchpad. Select the ‘Settings’ tab and choose ‘Sync Library with Control Panel’ from ‘Rebuild All’.

The changes below need to be introduced in the web.config of the CM instance. As a best practice, this transform file can be added to the solution. If there is an existing CM web.config transform, the config below can be appended to it.

  <location path="sitecore modules/Maintenance Notification/js/serviceworker.js">
    <system.webServer>
        <httpProtocol>
            <customHeaders>
                <add name="Service-Worker-Allowed" value="/" />
            </customHeaders>
        </httpProtocol>
    </system.webServer>
  </location>

The module comes with default VAPID keys, but it is highly recommended to replace the defaults for security reasons. New VAPID keys can be created using the web-push npm package. The commands below generate a new public/private VAPID key pair:

npm install web-push -g
web-push generate-vapid-keys


The public key needs to be updated in the /sitecore/system/Modules/PowerShell/Script Library/Maintenance Notification/Notification Settings item. The private key needs to be updated in the DevOps/Windows PowerShell scripts (described in the next section).

The module includes a default Sitecore API Key (works for 9.1+). However, it is necessary to clone/create a new ‘OData Item API Key’ item (with the same values) under /sitecore/system/Settings/Services/API Keys of the master database for v9.1+ (make sure to publish it to web). For v9.0, the item needs to be created under /sitecore/system/Settings/Services/API Keys of the core database.

For the Impersonation User, it is recommended to create a new user (e.g. sitecore\MaintenanceNotification) with read-only access to the items under /sitecore/system/Modules/PowerShell/Script Library/Maintenance Notification. The item ID of this newly created ‘OData Item API Key’ item will need to be updated in the DevOps/Windows PowerShell scripts (described in the next section).

FOR RECEIVING ALERTS:
To receive maintenance notifications, Content Authors and Marketers just have to select ‘Subscribe to Scheduled Maintenance Notifications’ in the ‘Preferences/My Settings’ section of the Control Panel, using a normal browser window (not incognito/InPrivate).

FOR SENDING ALERTS:
The DevOps team has two options for sending notifications during deployments.
DevOps Integration (Automated):
The module comes with two PowerShell scripts.

The scripts require four variables. Since $sitecoreHostUrl, $vapidPrivateKey and $sitecoreAPIKey stay consistent across releases, they can be created as pipeline variables. The VAPID private key and the Sitecore API Key item ID obtained in the above section need to be assigned to $vapidPrivateKey and $sitecoreAPIKey respectively. $sitecoreHostUrl is the CM domain URL. $maintenanceStartDateTime needs to be supplied as input for every release (format: yyyy/MM/dd HH:mm EST, e.g. 2021/01/18 10:00 EST). The script can be tweaked to accept the deployment start date/time in another timezone if needed.
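As an illustration of the expected input format, the Python sketch below parses a value such as 2021/01/18 10:00 EST and derives a reminder time from it. It is not the module's PowerShell code: the fixed UTC-5 offset (no daylight saving) and the 30-minute reminder window are assumptions made for the example, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is treated as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")


def parse_maintenance_start(value):
    """Parse 'yyyy/MM/dd HH:mm EST' into a timezone-aware datetime."""
    stamp, zone = value.rsplit(" ", 1)
    if zone != "EST":
        raise ValueError("only EST input is handled in this sketch")
    return datetime.strptime(stamp, "%Y/%m/%d %H:%M").replace(tzinfo=EST)


def reminder_time(start, minutes_before=30):
    """Time at which a reminder would go out (30 minutes is an assumed value)."""
    return start - timedelta(minutes=minutes_before)
```

Keeping the parsed value timezone-aware is what makes it possible to show alerts in each user's local timezone later on.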

These scripts work with all DevOps tools that support PowerShell tasks. The initial demo uses an Azure DevOps self-hosted agent to deploy to a Windows machine.

Windows PowerShell(Manual):
The module includes a Windows PowerShell script (SendMaintenanceNotification(Optimized for Manual Notification).ps1), which can be used for sending/clearing notifications at any time without a DevOps tool.
The script requires the deployment start date/time as input for every release in order to send notifications accordingly. The Sitecore CM domain URL, VAPID private key and Sitecore API Key item ID (created in the above section) need to be set directly in this script for the $sitecoreHostUrl, $vapidPrivateKey and $sitecoreAPIKey variables.
The script automatically sends scheduled alerts, reminders, and completion messages to all subscribed users based on the input date/time. It sends notifications based on the availability of the site, so do not close the PowerShell window until the end of maintenance; otherwise reminders and completion messages might be skipped.

The module configuration is now complete!

BACKGROUND:
This module leverages the following popular libraries/modules,

  • Service Worker – installed in the browser on click of the Subscribe button for receiving notifications. Service Workers are supported in recent versions of all major browsers.
  • Web-push NPM Package – used for sending optimized push notifications.
  • IDB-KeyVal library served via cdnjs – a tiny JS library imported by the Service Worker into users’ browsers for easily storing/retrieving values in the browser’s IndexedDB (helps avoid loss of data upon browser/system restart).

This module also relies on the below Sitecore services to keep the details centralized and editable from Sitecore without impacting security,

  • Sitecore ItemService – for fetching Notification Settings and storing User Subscription information received from Service Worker
  • Sitecore OData Item Service – for reading the Subscriptions from Sitecore during DevOps/Windows PowerShell script execution

TIPS:

  • In the event of a maintenance cancellation, running the manual PowerShell script without specifying any input will automatically clear notifications for all subscribed users.
  • If the maintenance time or details change, sending the updated time or details through the Windows/DevOps PowerShell script will overwrite the previously sent details for all users.
  • A user might have removed the service worker while clearing browser data. In such cases, the PowerShell script will throw a subscription expired error with the failed user’s subscription endpoint URL. This won’t affect notifications for other users, but it is recommended to manually remove the respective subscription JSON (along with its date) from the /sitecore/system/Modules/PowerShell/Script Library/Maintenance Notification/Push Subscriptions item.
  • If authors/marketers need more lead time, adding Agentless jobs with Manual Intervention or Delay tasks helps to communicate well in advance and also frees up agents for other tasks during the wait time. Alternatively, the manual approach can be used just for the initial communication.
  • Service Workers require localhost or HTTPS. If the implementation doesn’t meet this requirement, the Invoke-AddWebFeatureSSLTask PowerShell command might help to get a quick self-signed certificate.
  • Notification Settings can be updated if/as needed in /sitecore/system/Modules/PowerShell/Script Library/Maintenance Notification/Notification Settings.

Source code for this module is available on GitHub.

Voila!

Design Considerations and Approaches for Scheduling Recurring Tasks/Workflows

As one of the first steps in an implementation’s automation journey, scheduling tasks is not necessarily easy or straightforward. Web applications have plenty of options for this (e.g. Sitecore Scheduler, Windows Task Scheduler, Azure Logic Apps, Container CronJob, Coveo Push API, Hangfire), but each option comes with pros and cons depending on the requirements. There are several basic and advanced considerations that need to be thought through before designing recurring scheduled tasks/workflows:

Changing demands/Extensibility – Certain jobs might have changing demands on flow or frequency, and the approach needs to be flexible enough to accommodate them. Azure Logic Apps is extremely useful in such cases, considering its extensive configuration options.
Review/Approval – Certain jobs might need the job administrator’s intervention to complete, and the selected approach needs to accommodate this review/approval.
Logging for tracing issues – It is critical to capture the job’s actions and errors, as this helps in troubleshooting issues. Log retention needs to be defined up front and the log storage size needs to be validated during the design phase.
Alerts on failed jobs – Early detection of issues is a key consideration for any kind of job; appropriate alerts on failure over the preferred communication channel (email, Slack/Teams etc.) are essential.
Reporting – Based on the criticality of the job, the job administrator needs to be notified about its success either every time or on a daily/weekly basis.
Caching – In scenarios where a recurring job has to interact with certain processed data on a regular basis, the data can be cached so that the scheduled task need not fetch/process it every time. Azure APIM & Logic Apps come in handy with extensive caching options. Sitecore CustomCache can also be leveraged.
Triggers & Execution Flexibility – To understand the need for any on-demand execution/scheduling, the triggers for the job need to be analyzed. This helps determine the best approach for enabling the intended job administrator (e.g. Content Author, Infrastructure Admin) to configure/schedule the job on demand.
Retry on Failures – Based on the nature/frequency of the job, automating 2-3 retry attempts before alerting a failure could be beneficial.
Manual/Automated cancellation – It is common for certain jobs to be paused/cancelled for a particular period based on internal/external factors. The job administrator needs to be given the required permissions for handling such scenarios.
Infrastructure – The job hosting platform must be capable of handling unforeseen load, or else it could even result in a site outage.
Concurrency Constraints – Concurrency constraints such as the number of simultaneous jobs, or the daily allowed jobs for a user or type of job, need to be pre-defined.
Documentation – Documenting the job procedure using flowcharts/UML/storyboards will not only assist job administrators but will also help during maintenance/troubleshooting.
Storage in case of import jobs – Appropriate storage needs to be defined for file-based jobs to store the artifacts involved (e.g. Azure Blob Storage, AWS S3 bucket).
Need for running jobs on Holidays – Certain jobs need not run outside of business hours or during holidays due to unavailability of data. It is essential to capture these scenarios and schedule accordingly, as it might save some bandwidth for the infrastructure.
Permissions – Besides providing permissions to job administrators for handling the scheduled task, it is essential to prevent unauthorized people from controlling/updating the job.
Frequency of the Job – Almost all of the available options, including Windows Task Scheduler, allow scheduling in seconds. The frequency needs to be carefully matched with data availability to avoid unnecessary load or delays.
Tools and Dependencies – Identifying tools/dependencies plays a key role in deciding where jobs are hosted. The required dependencies need to be accessible to the scheduled job in a secure way.
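Two of the considerations above, retrying on failure and alerting only once retries are exhausted, can be sketched together in a few lines of Python. This is a generic illustration, not tied to any of the scheduling tools mentioned; run_with_retries and its parameters are hypothetical names.

```python
import time


def run_with_retries(job, attempts=3, delay_seconds=0.0, alert=print, sleep=time.sleep):
    """Run job(); retry on exception and alert only after the final failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except Exception as error:  # a real job would catch narrower errors
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # wait before the next attempt
    # All attempts failed: notify once, then surface the last error.
    alert(f"job failed after {attempts} attempts: {last_error}")
    raise last_error
```

The alert callable is where an email or Slack/Teams notification would be plugged in; passing it as a parameter keeps the retry logic independent of the communication channel.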

Here are some key approaches that work well for Sitecore implementations:

The Sitecore Way
Sitecore offers multiple options for scheduling jobs, including Scheduled Tasks in the Sitecore interface, scheduling agents, SiteCron etc. Sitecore PowerShell with Remoting is a great option when jobs need to be triggered by external actions or when flexibility is needed to change the flow/behavior without development. Using Sitecore scheduling options for jobs that do not interact with Sitecore might adversely affect CM performance, and hence needs to be evaluated thoroughly.

The Cloud Way
Azure WebJobs can be used for scheduling if the task needs to run in the context of App Services. Otherwise, Logic Apps is recommended for automating tasks/workflows. An Azure Logic App comes with a unique URL that allows it to be invoked as and when needed. It also provides options to configure the appearance of reports/emails. Azure Logic Apps is usually combined with API Management (and at times Azure Functions) for additional security (whitelisting/blacklisting, integration with AD etc.), load balancing & failover, caching of responses for certain kinds of requests, and extensive monitoring & telemetry capabilities.
If the implementation uses AWS, services such as Batch jobs, Lambda functions and API Gateway can be used.

Windows Server
Windows Task Scheduler is the most common job scheduling mechanism for IaaS/on-prem instances. It is usually preferred for jobs that don’t involve Sitecore interactions, and such jobs are commonly implemented with PowerShell. Windows Task Scheduler allows scheduling multiple actions for a specific trigger.

The Search Platform Way
There might be scenarios where data from an external repository needs to be pulled into the search platform for presenting on the site as it is (without storing/versioning it in Sitecore). In such cases, the data can be imported directly into the search platform instead of using the Sitecore Scheduler, which reduces load on CM, especially when the job is expected to process large volumes of data. SOLR allows importing data directly with the update/dataimport handlers, and the import can be run automatically with PowerShell/curl (scheduled with the options available on the hosting platform). For Coveo-based implementations, the same can be achieved with the out-of-the-box option or with the Coveo Push API and scheduling.
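For the SOLR case, the scheduled script essentially just calls the import handler over HTTP. The Python sketch below builds such a URL, assuming the Data Import Handler is registered at its conventional /dataimport path in solrconfig.xml; the host, port and core name are placeholders, and dataimport_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def dataimport_url(solr_base, core, command="full-import", clean=True, commit=True):
    """Build the URL that triggers a Data Import Handler run for a core."""
    params = {
        "command": command,            # e.g. full-import or delta-import
        "clean": str(clean).lower(),   # remove existing documents first
        "commit": str(commit).lower(), # commit once the import finishes
    }
    return f"{solr_base.rstrip('/')}/{core}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and core name (hypothetical).
    print(dataimport_url("http://localhost:8983/solr", "events"))
```

The resulting URL can then be requested by whatever scheduler the hosting platform provides.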

The Container Way
It is certainly possible to run scheduled jobs from Docker/Kubernetes using the CronJob API and open-source add-ons like Tasker, Ofelia etc. The image for the cron job needs to be built on top of the base image(s). This option works well when the scheduled job depends only on components within the container.

With the wide range of scheduling options available, spending time on evaluating the options and designing the job scheduling is crucial for achieving effective, long-lasting solutions.

Happy Scheduling!

Achieving End-User Visibility and Traceability for Sitecore Implementation with Azure AppInsights

It is quite common for an application or its integrations to behave differently across users, browsers, devices, regions etc. Most performance issues and end-user exceptions are detected only after an end user reports them, by which time they may already have affected user engagement and retention.


DevOps needs automated feedback from end-users to facilitate a stunning experience for them

Setting up real-time instrumentation is a key solution for gaining end-user visibility and traceability: it provides performance and experience feedback for the site in real time from all users, browsers, devices and other segments. Depending on the current platform and the level of visibility/feedback needed, several real-time instrumentation solutions are available, ranging from Pingdom RUM and Site24x7 RUM to New Relic and AppDynamics.

AppInsights JS SDK:
The AppInsights JS SDK allows capturing the client-side experience in Azure AppInsights. This open-source SDK helps record several critical insights:

  • Real-user page load times
  • Real-user load times of AJAX calls, js/css, media, external dependencies etc.
  • External dependency load failures and status code
  • Client-side Exceptions, Stack Trace
  • Request/response headers for AJAX requests, js/css, external dependencies etc.
  • User time spent on pages
  • User Device, OS etc.
  • User IP, Location etc. (if enabled)
  • Correlation between client-side and server-side requests for an operation
  • Browser Link Requests tracking

The AppInsights JS SDK also allows recording custom attributes in AppInsights (e.g. user profile attributes, segments; make sure to review your compliance strategy for GDPR, CCPA etc.), which is a key differentiator. Telemetry capturing can also be disabled for certain users based on custom conditions.

Setup & Configuration:
AppInsights JS SDK supports both Snippet-based Integration and NPM-based Integration.

If the implementation uses NPM (e.g. JSS), NPM-based integration is preferable. The command below installs the module:

npm i --save @microsoft/applicationinsights-web

The command below installs the light version instead:

npm i --save @microsoft/applicationinsights-web-basic

The JavaScript code that needs to be integrated into the layout file, so that it renders on all pages, can be found here.

Snippet-based integration is the option to choose when the implementation doesn’t use NPM. The JavaScript snippet that needs to be added to the layout can be found here. It is recommended to keep it as high as possible in the head tag so that it can capture errors occurring in all the dependencies. The HTML Snippet component can be used for Sitecore SXA based implementations.

The instrumentation key for AppInsights can be found in the respective AppInsights resource.

Depending on the number of environments and the DevOps process for the implementation, the instrumentation key can be stored in config, environment variables, Azure Application Settings, CI/CD pipeline variables etc. if/as needed.

Once the setup is complete, the AppInsights JS SDK starts sending data from the user’s browser to Azure AppInsights, which can be verified from the Network tab.

Data is usually recorded in Azure AppInsights within 2-3 minutes.

The Page Views table in Azure AppInsights, which is usually empty, will now show real-time records along with any custom data posted.

The Dependencies table will start recording AJAX calls, dependencies etc.

The Exceptions table will now start recording client-side exceptions and failures as well.

The Requests, Dependencies and Exceptions tables will also continue to capture server-side application monitoring data as usual.

With the AppInsights JS SDK’s unique operation id and custom ids (e.g. session id), client-side page views, dependencies and exceptions can easily be correlated with server-side dependencies/requests. This facilitates early identification, quicker tracing and faster resolution of critical production end-user issues, and thereby a stunning experience for all customers.
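The correlation idea can be illustrated with a small Python sketch that groups exported telemetry rows by operation id. The record layout used here (operation_id, source and table keys) is a hypothetical simplification for the example, not the actual AppInsights schema.

```python
from collections import defaultdict


def correlate(records):
    """Group telemetry records by operation id, preserving their order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["operation_id"]].append(record)
    return dict(grouped)


def operations_with_client_errors(records):
    """Return sorted operation ids that include a client-side exception."""
    return sorted(
        op
        for op, items in correlate(records).items()
        if any(i["source"] == "client" and i["table"] == "exceptions" for i in items)
    )
```

Once grouped this way, the server-side requests and dependencies belonging to a failing client-side operation can be inspected side by side.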

Happy Monitoring!

Accelerating Workflow Process with Sitecore + Microsoft Teams Integration Module

It is important to equip Content Authors with the tools/integrations that help them accomplish tasks with as few clicks as possible. When more than one person is involved (Content Author, Content Approver), expected or unexpected communication delays can easily hold up time-sensitive updates from going live. Automating this communication through the business communication/collaboration platform, along with quick actions (approve/reject), not only avoids the delays but also helps optimize time and work. This module allows seamless integration of Sitecore Workflow with Microsoft Teams (which is now replacing Skype).

The module can be downloaded from the below links,
Sitecore – Microsoft Teams Integration for SPE 6.x
Sitecore – Microsoft Teams Integration for SPE 5.x

NOTE:

  • This module is built on top of Sitecore Powershell Extensions (SPE), please make sure SPE is installed before installing this module
  • This module uses Sitecore Feed Urls, hence Approvals reside behind Sitecore Security and the links will prompt the Content Approver to authenticate before approving/rejecting the updates (if not logged in already)

STEPS TO CONFIGURE THE MODULE:


Creating Microsoft Teams Channel:

  • Navigate to ‘Teams’ section of Microsoft Teams from left navigation bar and click on ‘Join or Create a Team’ Button
  • Select ‘Build a team from scratch’ (or) ‘Create from an existing team’ to create the team/channel
  • Provide Name (preferably ‘Workflow Moderators’) & Description for the Team and add required Content Authors & Approvers

Adding WebHook:

  • Once the team/channel is created, click on the ‘More Options’ (three dots) button in the top right corner and click on ‘Connectors’
  • Search for ‘Incoming Webhook’ in the popup that opens and click on the ‘Add’ button
  • Add a Name (preferably ‘Sitecore Workflow’) and an Image, and click on the ‘Create’ button
  • Copy the ‘WebHook URI’ that appears before closing the window
  • Navigate to the \App_Config\Include\Sitecore+MicrosoftTeams.Integration.config file and paste the copied WebHook URI into the value attribute of the ‘Teams.WebhookURI’ setting

Setting up Workflow Notifications:

  • Navigate to the Workflow for which the Notifications have to be setup
  • Right-Click Submit Action of Draft State, click on Insert -> PowerShell Action and Enter name as ‘Awaiting Approval Notification’
  • Select ‘Script Library/Sitecore + Microsoft Teams Integration/Workflow/Awaiting Approval Notification’ for ‘Script’ field in the newly created ‘Awaiting Approval Notification’ item
  • Similarly, add PowerShell Actions under Approve & Reject actions and choose ‘Script Library/Sitecore + Microsoft Teams Integration/Workflow/Approved Notification’ & ‘Script Library/Sitecore + Microsoft Teams Integration/Workflow/Rejected Notification’ for ‘Script’ field respectively
  • If you’re using Sitecore PowerShell Extensions version 6.1.1 or older and face the error described in https://github.com/SitecorePowerShell/Console/issues/1204, update the ‘Type string’ field value of the ‘Approved Notification’ and ‘Rejected Notification’ items to ‘Sitecore.MicrosoftTeams.Integration.Workflows.ScriptAction, Sitecore.MicrosoftTeams.Integration’ (this ScriptAction class comes with the module)

The configuration is now complete!
Awaiting Approval, Approved and Rejected Notifications should now be appearing in the configured Teams Channel.
Voila!
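For readers curious about what travels over the webhook, here is a rough Python sketch of a MessageCard-style payload with approve/reject links, which is a format Teams incoming webhooks accept. The module's actual payload may well differ; the function names, card texts and cm.example.com URLs are hypothetical.

```python
import json


def open_uri_action(name, uri):
    """Build an OpenUri action that opens the given link from the card."""
    return {"@type": "OpenUri", "name": name, "targets": [{"os": "default", "uri": uri}]}


def build_workflow_card(item_name, author, approve_url, reject_url):
    """Build a MessageCard payload announcing an item awaiting approval."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "summary": f"{item_name} is awaiting approval",
        "title": "Awaiting Approval",
        "text": f"{author} submitted '{item_name}' for approval.",
        "potentialAction": [
            open_uri_action("Approve", approve_url),
            open_uri_action("Reject", reject_url),
        ],
    }


if __name__ == "__main__":
    # Placeholder values (hypothetical).
    card = build_workflow_card(
        "Home", "sitecore\\author",
        "https://cm.example.com/approve", "https://cm.example.com/reject",
    )
    print(json.dumps(card, indent=2))
```

Because the links point back to Sitecore, the approver is still asked to authenticate before the action is performed, as noted earlier.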


Investigating Audit Logs with Sitecore Powershell Reports

Content Administrators very often need the history of activities performed on an item, such as edits, publishes and workflow approvals. Though options like Sitecore Log Analyzer exist, they require a developer’s assistance, while it is critical to equip Content Administrators to function independently.

Sitecore PowerShell comes with a super cool report, ‘Find Audit Trail from logs’, which crawls through the logs for the specified timeframe and yields all the audit entries (exceptions are not included; Sitecore Log Analyzer still needs to be used for analyzing those). I haven’t seen much mention of this report, and from speaking to a few people it seems to have gone unnoticed. Here is how you can access it:

Like any other Sitecore PowerShell report, this report comes with export and filter options, which makes our life easier.
We experienced one issue, which was caused by the format of certain log entries.

A few log entries in our instance had an odd additional space before the thread id, even though the log4net conversionPattern etc. were correct. This appeared to be the root cause of the above issue.

Updating the affected line in the report script to remove empty entries fixed the issue.
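The effect of removing empty entries is easy to see in a small Python sketch. It assumes a simplified log layout of thread id, time, level and message; the sample line in the usage note is made up for illustration, and parse_audit_line is a hypothetical helper rather than the report's PowerShell code.

```python
def parse_audit_line(line):
    """Split a log line into thread id, time, level and message."""
    # split() without a separator collapses repeated whitespace and drops
    # empty entries, unlike split(" "), which yields "" for every extra space.
    parts = line.split(None, 3)
    if len(parts) < 4:
        return None  # not enough fields to be a log entry
    thread_id, time_of_day, level, message = parts
    return {"thread": thread_id, "time": time_of_day, "level": level, "message": message}
```

For a line such as "  4460 10:23:45 INFO  AUDIT (sitecore\admin): Save item", a plain split on a single space returns an empty string as the first entry, so the thread id and every later field shift position, while the version above still reads 4460 as the thread id.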

By default, Sitecore PowerShell reports are enabled only for Administrators. When they are needed for other roles, read access needs to be granted on the /sitecore/content/Documents and settings/All users/Start menu/Right/Reporting Tools/PowerShell Reports item in the core database.

To provide access only to this specific report, deny access needs to be added on the other reports under the /sitecore/system/Modules/PowerShell/Script Library/SPE/Reporting/Content Reports/Reports item in the master database. It is important to note that the Content Author will be able to view the history of activities performed on all items, including the ones they might not have access to.

This report was introduced in PowerShell Extensions 5.0; if you are using an older version, you could use this package.

If you are using Sitecore PaaS with Azure Application Insights, you will need the Application Insights App ID/Key and will have to fetch the logs using Azure Application Insights queries. The blog below will be helpful for the PaaS setup: https://www.sitecoregabe.com/2019/09/azure-application-insights-logs.html

References:
https://www.sitecoregabe.com/2018/08/basic-sitecore-audit-trail-with.html
https://github.com/SitecorePowerShell/Console/issues/1033

Efficient way to retrieve list of created/updated and deleted items during Publish

While we are familiar with the publish:itemProcessed event, which is triggered once per processed item during publish, there might be instances where we want to work on the entire list of published items at once to accomplish the desired operations efficiently.

Sitecore has introduced the ProcessedPublishingCandidates property (previously ProcessedItems, which is now obsolete), which accumulates the processed items within the PublishItem pipeline. The PublishItem pipeline is responsible for publishing a single item from the publish queue (this pipeline can also be intercepted with the publish:itemProcessed and publish:itemProcessing events). The ProcessedPublishingCandidates property is accessible within the Publish pipeline as well (this pipeline manages the entire publish operation). This enables us to retrieve the list of published items by intercepting the Publish pipeline and reading the ProcessedPublishingCandidates property.

Sitecore has also added the DeleteCandidates property, which is helpful if you only want to retrieve the deleted items.

Here is the code snippet, followed by the patch configuration that registers the processor:

using System.Linq;
using Sitecore.Diagnostics;
using Sitecore.Publishing.Pipelines.Publish;

public class CustomPublishProcessor : PublishProcessor
{
    public override void Process(PublishContext context)
    {
        Assert.ArgumentNotNull(context, "context");

        if (context.Aborted)
            return;

        // Fetches the processed (created/updated/deleted) candidates and resolves them in the target database;
        // items that no longer exist there (e.g. deleted ones) resolve to null and are filtered out
        var processedItems = context.ProcessedPublishingCandidates.Keys
                .Select(i => context.PublishOptions.TargetDatabase.GetItem(i.ItemId))
                .Where(j => j != null);

        // Fetches the list of deleted items
        var deletedItems = context.DeleteCandidates;
    }
}

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <publish>
        <processor patch:after="*[@type='Sitecore.Publishing.Pipelines.Publish.ProcessQueue, Sitecore.Kernel']" type="Assembly.Pipelines.CustomPublishProcessor, Assembly"/>
      </publish>
    </pipelines>
  </sitecore>
</configuration>

Please note that PublishContext also exposes a ‘CustomData’ property, which is shared across all the pipeline steps. This allows us to store any custom data specific to a processed item in CustomData during the PublishItem pipeline and use it in other steps of the PublishItem and Publish pipelines if needed.

SOLR Security Vulnerability CVE-2019-0192 (SOLR-13301) – Disabling Config API

Applying fixes for the Security Vulnerabilities is a critical activity needed for preventing any intrusions and for ensuring the security of the system. While it is vital to continuously monitor and apply the Security fixes released by the tools used in the implementation, it is also essential to review and deploy the previously released Security fixes for the versions of the opted tools during initial infrastructure setup.

This blog covers the options available to mitigate a SOLR security vulnerability, CVE-2019-0192 (SOLR-13301), which affects versions 5.0–5.5.5 and 6.0–6.6.5. When using Sitecore 9.0 or 8.2 with SOLR as the search platform, the implementation might be running on one of these SOLR versions. The options available to address this vulnerability can be found in the Sitecore and SOLR documentation:
https://kb.sitecore.net/articles/227897#note6 
https://issues.apache.org/jira/browse/SOLR-13301

While the recommended option is to upgrade to a recent version of Sitecore and SOLR, which also makes the latest features available, disabling the Config API works well as an immediate solution.

The Config API enables manipulating various aspects of solrconfig.xml using REST-like API calls.
This feature is enabled by default and works similarly in both SolrCloud and standalone mode. Many commonly edited properties (such as cache sizes and commit settings) and request handler definitions can be changed with this API.

The Config API can be disabled by adding the system property disable.configEdit=true to the SOLR_OPTS environment variable defined in the solr.in.cmd file located in the SOLR bin folder. This can be achieved by adding the following lines to solr.in.cmd:

REM Disabling Config API for mitigating Security Vulnerability https://issues.apache.org/jira/browse/SOLR-13301
set SOLR_OPTS=%SOLR_OPTS% -Ddisable.configEdit=true

The SOLR service must be restarted for the newly added system property to take effect.

If you are using a PowerShell script to install SOLR, adding the following lines to the script should take care of it:

$SolrInCmd_Path = "S:\SOLR\solr-6.6.5\bin\solr.in.cmd"

##Disabling Config API for mitigating Security Vulnerability for v6.6.5 https://issues.apache.org/jira/browse/SOLR-13301
Add-Content -Path $SolrInCmd_Path -Value 'REM Disabling Config API for mitigating Security Vulnerability https://issues.apache.org/jira/browse/SOLR-13301'
Add-Content -Path $SolrInCmd_Path -Value 'set SOLR_OPTS=%SOLR_OPTS% -Ddisable.configEdit=true'

The following curl command can be used to confirm that the Config API is disabled. It should result in a 403 Forbidden error:

curl https://localhost:8983/solr/<core_name>/config -H "Accept: application/json" -H "Content-type:application/json" -d "{'set-user-property' : {'variable_name':'some_value'}}"

Alternatively, if the Config API is used in the implementation, viable options are to apply SOLR-13301.patch and recompile SOLR, or to harden the network settings so that only trusted traffic is allowed.

Extend Sitecore Experience to China

With China being one of the top economies in the world, a large number of businesses are looking to expand their services there. Since websites are the face of an organization, it is crucial that they perform well across all intended geographies. Performance, however, is a major concern for websites served to China from outside the country. While adding a Content Delivery Network (CDN) usually improves performance for countries far from the hosting servers, the CDN setup process works a little differently for China.

There are a couple of major issues that contribute to the slowness of external websites in China:

Limited Peering Capacities – China has very few internet providers, and direct peering is expensive. This makes internet traffic exchange with the country highly congested, which is considered the key reason for the slowness.

The Great Firewall of China (GFW) – The country uses Deep Packet Inspection to monitor all request traffic. This can also cause packet loss, resulting in retransmissions that slow the transaction even more.

To mitigate the above and provide consistent load times in China, the easier route is to host the website in China. Hosting a website in China requires an ICP (Internet Content Provider) License, which in turn requires the company to have a China business entity and a domain registered with a Chinese domain registrar. This may mean purchasing a new domain specifically for the China region from a Chinese domain registrar, probably .cn or .com.cn, depending on the existing domain setup. Once the ICP application is submitted and approved by the Ministry of Industry and Information Technology (MIIT), an ICP Number is issued, which must be displayed in the website’s footer.

A few tasks, including infrastructure setup, can happen in parallel with the ICP registration. There are several hosting options, such as Aliyun (Alibaba Cloud), Azure and AWS. Since AWS is one of the top providers in China and our existing sites were hosted on AWS (IaaS), the AWS China IaaS solution worked best for our situation. Platform selection requires some analysis based on the existing infrastructure. A load balancer should be considered for high availability.

Here are a few approaches to consider when hosting the website in China:

Setting up a New Website Infrastructure – A few businesses prefer to run their China websites separately from the existing infrastructure due to differences in website appearance, behaviour or content. This requires setting up all the required Application/Storage Roles and Environments.

Extending the Existing Website Infrastructure – Most businesses prefer to retain their existing branding etc., which eliminates the need to set up a Content Management Role, Staging, UAT etc. unless specifically required. Content Delivery and xDB/xConnect Application/Storage Roles are sufficient.

Setting up CDN in China – Using a China CDN is another option that works well, but it introduces a lot of challenges in leveraging the Experience Management capabilities of Sitecore. Please note that setting up a CDN in China still requires ICP Registration.

 

It is crucial to review the China Data Protection Regulations (CDPR) whenever personal information is collected in any part of the website. Though there are various draft measures and sector-specific regulations on the transfer of personal data outside the borders of China, it is unclear at this point to what extent these rules apply, and they are expected to be clearly defined this year. Large organizations may also be required to appoint a Data Protection Officer in China, depending on the quantity of data being collected or processed.

Though Google Chrome is widely used in China, several other browsers, including Tencent’s QQ browser, Alibaba’s UC browser, Baidu and Sogou, are used by a large number of users. Hence it is important to make sure our websites are compatible with these additional browsers across different devices.

A few social media platforms, including Facebook and Google+, are blocked in China, so any related integrations must be reviewed.

Though Google Analytics works in China, Baidu is highly preferred for the following reasons:

  • Latency and Data loss issues when using Google Analytics
  • Google Analytics Web Interface being blocked in China

There is also no guarantee that Google Analytics won’t be blocked in the future.

Despite all the above challenges and processes, it is still worth doing. Our website performed about 70-80% faster in different regions of the country after we hosted the sites in China.