
Setting up private backups for Sanity using Azure Functions

Content Management

Davit Hakobyan, CX Technical Consultant

With Content Lake, Sanity offers state-of-the-art storage for highly structured content. Today, content is considered a strategic asset for many enterprises, so let's find out how you can safeguard it by setting up your own private backup system.

The guide below provides step-by-step instructions on how to set this up.

Setting Up the Azure Function

This is the core code for the Azure Function. Let’s walk through it section by section.

require("dotenv").config();
const { createClient } = require("@sanity/client");
const exportDataset = require("@sanity/export");
const fs = require("node:fs");
const {
  ShareServiceClient,
  StorageSharedKeyCredential,
} = require("@azure/storage-file-share");
const { app } = require("@azure/functions");

First, we load the necessary dependencies: the Sanity client, the Sanity export package, Node's fs module, the Azure Storage File Share library, and the Azure Functions programming model.

We also use Dotenv to load our environment variables.
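A minimal .env file for local development might look like this; the variable names match those referenced throughout the code, and the values are placeholders for your own project:

SANITY_STUDIO_PROJECT_ID=your-project-id
SANITY_STUDIO_DATASET=production
SANITY_API_READ_TOKEN=your-read-token
AZURE_FILES_ACCOUNT_KEY=your-storage-account-key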

Defining the Backup Function

The main function is defined under a daily timer trigger in Azure Functions:

app.timer("dailyBackup", {
  schedule: "0 0 0 * * *", // Daily at midnight
  handler: async (myTimer, context) => {
    const DATASET = process.env.SANITY_STUDIO_DATASET;

    const sanityClient = createClient({
      projectId: process.env.SANITY_STUDIO_PROJECT_ID,
      dataset: DATASET,
      useCdn: false,
      apiVersion: "2023-05-03",
      token: process.env.SANITY_API_READ_TOKEN,
    });

This section sets up the sanityClient with the project ID, dataset, API version, and a read token. The timer trigger is configured with a cron-style schedule, so the function executes every day at midnight.
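Azure Functions timer triggers use six-field NCRONTAB expressions (second, minute, hour, day, month, day of week), so the schedule is easy to adjust. For example, to run the backup at 02:30 every night instead:

app.timer("dailyBackup", {
  schedule: "0 30 2 * * *", // Every day at 02:30
  handler: async (myTimer, context) => {
    // ...
  },
});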

Exporting the Dataset

The backup function leverages Sanity’s export package to create a compressed archive (tar.gz) containing the dataset:

async function backup() {
  await exportDataset({
    client: sanityClient,
    dataset: DATASET,
    outputPath: `/tmp/${DATASET}.tar.gz`,
    assetConcurrency: 12,
  });
}

This creates a compressed archive of the Sanity dataset in the function host's temporary storage area (/tmp/).

Uploading to Azure Storage

Next, the code uploads the backup file to Azure Storage:

const shareName = "shareName"; // name of your Azure file share
const directoryName = "backups"; // directory within the share that holds the archives
const account = "account"; // your storage account name
const accountKey = process.env.AZURE_FILES_ACCOUNT_KEY;

const credential = new StorageSharedKeyCredential(account, accountKey);
const serviceClient = new ShareServiceClient(
  `https://${account}.file.core.windows.net`,
  credential
);

Here, StorageSharedKeyCredential authenticates the ShareServiceClient against the storage account. The client then uploads the backup archive as a file stream, splitting it into manageable chunks.
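The upload step itself is not shown in the snippet above. A minimal sketch, assuming a helper called uploadBackup and a date-stamped file name (both illustrative, not part of the original code), might look like this:

async function uploadBackup() {
  const localPath = `/tmp/${DATASET}.tar.gz`;
  const fileName = `${DATASET}-${new Date().toISOString().slice(0, 10)}.tar.gz`;

  const fileClient = serviceClient
    .getShareClient(shareName)
    .getDirectoryClient(directoryName)
    .getFileClient(fileName);

  // Stream the archive into the file share in 4 MiB chunks, up to 5 chunks in flight.
  const { size } = fs.statSync(localPath);
  await fileClient.uploadStream(
    fs.createReadStream(localPath),
    size,
    4 * 1024 * 1024,
    5
  );
}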

Automating Backup Cleanup

To avoid storage overflow, we include a function to delete old backups, retaining only the latest 31 files:

async function deleteOldestZips() {
  const directoryClient = serviceClient
    .getShareClient(shareName)
    .getDirectoryClient(directoryName);
  
  let zipFiles = [];
  for await (const item of directoryClient.listFilesAndDirectories()) {
    if (item.kind === "file" && item.name.endsWith(".gz")) {
      const fileClient = directoryClient.getFileClient(item.name);
      const properties = await fileClient.getProperties();
      zipFiles.push({ name: item.name, lastModified: properties.lastModified });
    }
  }

  // Sort oldest first, then mark everything beyond the 31 most recent for deletion.
  zipFiles.sort((a, b) => a.lastModified - b.lastModified);
  const filesToDelete =
    zipFiles.length > 31 ? zipFiles.slice(0, zipFiles.length - 31) : [];

  for (const file of filesToDelete) {
    await directoryClient.getFileClient(file.name).delete();
  }
}

This function ensures only the 31 most recent backup files are retained by sorting the files by modification date and removing the oldest ones.
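Inside the timer handler, the pieces run in sequence. The wiring and error handling below are a sketch rather than a verbatim excerpt (uploadBackup is the illustrative helper from the previous section):

try {
  await backup();           // export the dataset to /tmp
  await uploadBackup();     // copy the archive to the Azure file share
  await deleteOldestZips(); // prune everything beyond the 31 most recent backups
  context.log(`Backup of dataset "${DATASET}" completed`);
} catch (err) {
  context.error(`Backup failed: ${err.message}`);
}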

Testing and Deployment

With the function code ready, we deploy it to Azure Functions, where it now runs automatically each day at midnight. From this moment on, Sanity content backups are reliably stored in Azure, with unnecessary files automatically removed.
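One way to deploy is with Azure Functions Core Tools, which also lets you test the function locally first; the function app name below is a placeholder:

# Run the function app locally
func start

# Publish to an existing function app in Azure
func azure functionapp publish your-function-app-name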

Multi-Layered Approach to Data Protection

When it comes to safeguarding critical data, redundancy is key. We strongly recommend enabling backups for your Azure file share as an additional layer of protection against data loss.

The current backup configuration follows a robust retention strategy:

  • Daily backups: Created every day and retained for a week to address short-term recovery needs.
  • Weekly backups: Generated once a week and stored for a month, providing coverage for medium-term requirements.
  • Monthly backups: Conducted once a month and retained for a year to handle long-term data restoration scenarios.

This tiered approach ensures you have multiple recovery points, accommodating a variety of scenarios ranging from accidental deletions to extended outages.

For detailed guidance on how to enable and configure backups for your Azure file share, refer to the official Azure Backup documentation. Taking these steps will significantly enhance the resilience and recoverability of your data.

Final Thoughts

Using the approach discussed, you can leverage Azure Functions to create a private backup solution for your structured content stored in Sanity. It is scalable and customizable enough to meet your unique business requirements, and with Azure's robust cloud infrastructure you can be confident that your data is securely stored and readily available.

This gives businesses using Sanity a reliable way to maintain full control over their backup processes.

Explore the full implementation to see how this setup could benefit your organization! The source code is available on GitHub.

Don’t hesitate to reach out if you have any questions or feedback!