API Reference

The scrap-ai v0.0.5 library provides an API for asynchronous web scraping with customizable prompts and webhook callbacks. This reference covers the primary classes, methods, and types you need to integrate it into a project.

ScrapeClient

ScrapeClient is the core class of the library. It starts scraping tasks and verifies webhook callbacks.

Constructor

new ScrapeClient(apiKey: string)

Creates a new instance of the ScrapeClient.

Parameters:
  • apiKey: string

    Your unique API key for authentication.

Example:
import { ScrapeClient } from "scrap-ai";

const client = new ScrapeClient("your-api-key");

Methods

scrape(url: string, prompt: string, callbackUrl: string): Promise<void>

Initiates a scraping operation. When the operation is complete, the results are sent via a POST request to the specified callbackUrl.

Parameters:
  • url: string

    The URL of the webpage you want to scrape.

  • prompt: string

    Instructions that define what data to extract. Customize this prompt to target specific content on the page.

  • callbackUrl: string

    The endpoint where the scraped data will be delivered.

Example:

    await client.scrape(
      "https://example.com",
      "Extract all product titles and prices",
      "https://your-api.com/webhook"
    );
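The reference does not say how scrape handles malformed input, so one option is to validate the arguments before calling it. The sketch below assumes a hypothetical wrapper `safeScrape` and helper `isHttpUrl`, neither of which is part of scrap-ai. The `Scraper` interface only mirrors the documented method signature so the helper can be tried without the real client.

```typescript
// Minimal shape of the documented scrape method (mirrors the signature above).
interface Scraper {
  scrape(url: string, prompt: string, callbackUrl: string): Promise<void>;
}

// Hypothetical helper: true when the string parses as an http(s) URL.
function isHttpUrl(value: string): boolean {
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    // new URL() throws a TypeError on unparseable input.
    return false;
  }
}

// Hypothetical wrapper: rejects bad input before any request is made.
async function safeScrape(
  client: Scraper,
  url: string,
  prompt: string,
  callbackUrl: string,
): Promise<void> {
  if (!isHttpUrl(url)) throw new Error(`Invalid target URL: ${url}`);
  if (!isHttpUrl(callbackUrl)) throw new Error(`Invalid callback URL: ${callbackUrl}`);
  if (prompt.trim() === "") throw new Error("Prompt must not be empty");
  await client.scrape(url, prompt, callbackUrl);
}
```

With a real client, this would be called as `await safeScrape(client, url, prompt, callbackUrl)` in place of `client.scrape(...)`.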
    

Callback Response Format

When your webhook endpoint receives a callback, the payload will follow this structure:

interface CallbackResponse {
  status: "success" | "error";
  data?: {
    url: string;
    results: any[];
    timestamp: string;
  };
  error?: {
    message: string;
    code: string;
  };
}

  • status: Indicates whether the scraping was successful or if an error occurred.

  • data: Present when the operation is successful. Contains:

    • url: The source URL that was scraped.
    • results: An array of extracted data.
    • timestamp: The time when the data was processed.
  • error: Present when the scraping fails, providing details about the error.
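Because data and error are both optional, a handler has to check status and the matching field before reading either one. A minimal sketch, assuming a hypothetical `summarizeCallback` function (not part of scrap-ai) and the payload shape documented above:

```typescript
// Payload shape copied from the reference above.
interface CallbackResponse {
  status: "success" | "error";
  data?: { url: string; results: any[]; timestamp: string };
  error?: { message: string; code: string };
}

// Hypothetical handler: turns a callback payload into a one-line summary.
function summarizeCallback(payload: CallbackResponse): string {
  if (payload.status === "success" && payload.data) {
    const { url, results, timestamp } = payload.data;
    return `Scraped ${results.length} item(s) from ${url} at ${timestamp}`;
  }
  if (payload.status === "error" && payload.error) {
    return `Scrape failed [${payload.error.code}]: ${payload.error.message}`;
  }
  // status and the optional fields disagree, e.g. "success" without data.
  return "Malformed callback payload";
}
```

The final branch covers payloads where status and the optional fields disagree, which the type alone does not rule out.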

Webhook Verification

To ensure the authenticity and integrity of incoming webhook requests, the library provides a method to verify the webhook signature.

verifyWebhook(options: VerifyWebhookOptions): boolean

Validates a webhook request by comparing the provided signature against a generated HMAC.

Parameters:

options: VerifyWebhookOptions (object) containing:

  • body: string

    The stringified JSON payload from the request.

  • signature: string

    The HMAC SHA-256 signature from the header (x-webhook-signature).

  • timestamp: string | number

    The timestamp from the header (x-webhook-timestamp).

  • maxAge: number (optional)

    Maximum allowed age of the webhook, in milliseconds. Defaults to 300,000 (5 minutes).

Example:
const isValid = client.verifyWebhook({
  body: JSON.stringify(req.body),
  signature: req.headers["x-webhook-signature"],
  timestamp: req.headers["x-webhook-timestamp"],
  maxAge: 300000, // 5 minutes in milliseconds
});

if (!isValid) {
  return res.status(401).json({ error: "Invalid webhook" });
}
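For readers who want to see what an HMAC SHA-256 check with a maximum age generally involves, here is a minimal sketch using Node's built-in crypto module. It is not the library's implementation. The function name `verifySignature`, the signing secret, the `<timestamp>.<body>` signed string, the hex encoding, and the millisecond timestamp unit are all assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical stand-in for an HMAC webhook check. The secret, the
// "<timestamp>.<body>" signed string, hex encoding, and millisecond
// timestamps are assumptions, not documented scrap-ai behavior.
function verifySignature(
  secret: string,
  body: string,
  signature: string,
  timestamp: string | number,
  maxAge: number = 300_000,
  now: number = Date.now(),
): boolean {
  const sentAt = Number(timestamp);
  // Reject non-numeric, stale, or far-future timestamps.
  if (!Number.isFinite(sentAt) || Math.abs(now - sentAt) > maxAge) {
    return false;
  }
  const expected = createHmac("sha256", secret)
    .update(`${timestamp}.${body}`)
    .digest();
  const provided = Buffer.from(signature, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature were correct, and the age check limits replay of captured requests.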

Environment Variables

The library requires the following environment variable to be set in your deployment environment:

SCRAP_API_KEY

Your API key for authentication.
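This reference does not state whether the library reads SCRAP_API_KEY itself. One option is to read the variable explicitly and pass it to the constructor, failing fast when it is missing. The helper `requireApiKey` below is hypothetical and not part of scrap-ai.

```typescript
// Hypothetical helper: reads SCRAP_API_KEY from an environment-like object
// and throws when it is missing or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["SCRAP_API_KEY"]?.trim();
  if (!key) {
    throw new Error("SCRAP_API_KEY is not set");
  }
  return key;
}
```

In Node.js this could be used as `new ScrapeClient(requireApiKey(process.env))`, so a misconfigured deployment fails at startup and not on the first scrape.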

Additional Resources

This reference covers the scrap-ai v0.0.5 library. For further assistance or inquiries, please contact our support team.