GET /v0/scrape-jobs/{id}
Get a scrape job by ID
curl --request GET \
  --url https://api.avidoai.com/v0/scrape-jobs/{id} \
  --header 'x-api-key: <api-key>' \
  --header 'x-application-id: <application-id>'
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "createdAt": "2024-01-05T12:34:56.789Z",
  "modifiedAt": "2024-01-05T12:34:56.789Z",
  "orgId": "org_123",
  "initiatedBy": "user_123",
  "name": "Documentation Scrape",
  "url": "https://example.com",
  "status": "PENDING",
  "applicationId": "456e4567-e89b-12d3-a456-426614174000",
  "pages": [
    {
      "url": "https://example.com/page1",
      "title": "Page 1",
      "description": "This is the first page of the documentation.",
      "category": "Documentation"
    },
    {
      "url": "https://example.com/page2"
    }
  ]
}

Authorizations

x-api-key
string
header
required

Your unique Avido API key

x-application-id
string
header
required

Your unique Avido Application ID

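Both headers must be sent on every call. As a minimal sketch, the request could be assembled with Python's standard library as follows. The helper name `build_get_scrape_job_request` and the `BASE_URL` constant are illustrative assumptions, not part of the Avido API; the URL and header names are taken from the curl example above.

```python
import urllib.request

# Base URL taken from the curl example; the constant name is an assumption.
BASE_URL = "https://api.avidoai.com/v0"


def build_get_scrape_job_request(
    job_id: str, api_key: str, application_id: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for a single scrape job."""
    return urllib.request.Request(
        url=f"{BASE_URL}/scrape-jobs/{job_id}",
        headers={
            "x-api-key": api_key,                # your Avido API key
            "x-application-id": application_id,  # your Avido Application ID
        },
        method="GET",
    )
```

The returned object can be passed to `urllib.request.urlopen`, and the response body decoded with `json.load`. Building the request separately from sending it keeps the header handling easy to check without network access.
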
Path Parameters

id
string<uuid>
required

The unique identifier of the scrape job

Pattern: ^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$
Example:

"123e4567-e89b-12d3-a456-426614174000"

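Checking the `id` on the client before sending the request can avoid a wasted round trip. The sketch below reuses the documented pattern verbatim; the function name `is_valid_job_id` is a hypothetical helper, and how the server responds to a malformed ID is not described here.

```python
import re

# The pattern documented for the `id` path parameter, split across lines for readability.
UUID_PATTERN = re.compile(
    r"^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}"
    r"|00000000-0000-0000-0000-000000000000"
    r"|ffffffff-ffff-ffff-ffff-ffffffffffff)$"
)


def is_valid_job_id(value: str) -> bool:
    """Return True if `value` matches the documented scrape job ID pattern."""
    return UUID_PATTERN.fullmatch(value) is not None
```

Note that the pattern accepts UUID versions 1 through 8 plus the all-zero and all-`f` (lowercase) special values, so a version-9 string is rejected.
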
Response

Scrape job retrieved successfully

Response containing the scrape job details

id
string<uuid>
required

The unique identifier of the scrape job

Pattern: ^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$
Example:

"123e4567-e89b-12d3-a456-426614174000"

createdAt
string<date-time>
required

When the scrape job was created

Pattern: ^(?:(?:\d\d[2468][048]|\d\d[13579][26]|\d\d0[48]|[02468][048]00|[13579][26]00)-02-29|\d{4}-(?:(?:0[13578]|1[02])-(?:0[1-9]|[12]\d|3[01])|(?:0[469]|11)-(?:0[1-9]|[12]\d|30)|(?:02)-(?:0[1-9]|1\d|2[0-8])))T(?:(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d(?:\.\d+)?)?(?:Z))$
Example:

"2024-01-05T12:34:56.789Z"

modifiedAt
string<date-time>
required

When the scrape job was last modified

Pattern: ^(?:(?:\d\d[2468][048]|\d\d[13579][26]|\d\d0[48]|[02468][048]00|[13579][26]00)-02-29|\d{4}-(?:(?:0[13578]|1[02])-(?:0[1-9]|[12]\d|3[01])|(?:0[469]|11)-(?:0[1-9]|[12]\d|30)|(?:02)-(?:0[1-9]|1\d|2[0-8])))T(?:(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d(?:\.\d+)?)?(?:Z))$
Example:

"2024-01-05T12:34:56.789Z"

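The date-time pattern only admits UTC timestamps ending in `Z`, so `createdAt` and `modifiedAt` can be converted to timezone-aware values directly. A minimal sketch, assuming a hypothetical helper named `parse_timestamp`; on Python versions before 3.11, `datetime.fromisoformat` accepts only 3 or 6 fractional digits, which covers the documented example but not every string the pattern allows.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as "2024-01-05T12:34:56.789Z"."""
    if not value.endswith("Z"):
        # The documented pattern only allows the "Z" (UTC) suffix.
        raise ValueError(f"expected a UTC timestamp ending in 'Z': {value!r}")
    # Replace the suffix with an explicit offset so older Python versions can parse it.
    return datetime.fromisoformat(value[:-1] + "+00:00")
```
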
orgId
string
required

ID of the organization that owns the scrape job

Example:

"org_123"

initiatedBy
string
required

ID of the user who initiated the scrape job

Example:

"user_123"

name
string
required

The name/title of the scrape job

Example:

"Documentation Scrape"

url
string<uri>
required

The URL that was scraped

Example:

"https://example.com"

status
enum<string>
required

Current status of the scrape job

Available options: MAPPING, PENDING, IN_PROGRESS, COMPLETED, FAILED
Example:

"PENDING"

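Because a job reports its progress through `status`, a client will typically call this endpoint repeatedly until the job finishes. The sketch below shows one way to do that. It assumes that `COMPLETED` and `FAILED` are the terminal states, which this page does not state explicitly, and the function name, the `fetch_job` callable, and the default interval are all hypothetical.

```python
import time
from typing import Callable

KNOWN_STATUSES = {"MAPPING", "PENDING", "IN_PROGRESS", "COMPLETED", "FAILED"}
# Assumption: these two values mean the job will not change any further.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_scrape_job(
    fetch_job: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch_job` until the job reaches a terminal status, then return it."""
    for attempt in range(max_attempts):
        job = fetch_job()
        status = job["status"]
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job still running after {max_attempts} attempts")
```

Here `fetch_job` stands for any function that performs the GET request and returns the decoded JSON body. Passing it in, along with `sleep`, keeps the loop testable without network access or real delays.
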
applicationId
string<uuid> · null

Optional application ID this scrape job belongs to

Pattern: ^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$
Example:

"456e4567-e89b-12d3-a456-426614174000"

pages
object[]

The pages scraped from the URL

Example:
[
  {
    "url": "https://example.com/page1",
    "title": "Page 1",
    "description": "This is the first page of the documentation.",
    "category": "Documentation"
  },
  {
    "url": "https://example.com/page2"
  }
]
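
The response fields above map naturally onto a typed structure. The following sketch uses dataclasses; the class and function names are hypothetical. It treats `applicationId` and `pages` as optional because they are not marked required, and it assumes that only `url` is guaranteed on a page, which is inferred from the example rather than stated by the schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ScrapedPage:
    url: str
    # Assumption: these fields may be absent, as in the second example page.
    title: Optional[str] = None
    description: Optional[str] = None
    category: Optional[str] = None


@dataclass
class ScrapeJob:
    id: str
    created_at: str
    modified_at: str
    org_id: str
    initiated_by: str
    name: str
    url: str
    status: str
    application_id: Optional[str] = None
    pages: List[ScrapedPage] = field(default_factory=list)


def parse_scrape_job(payload: Dict[str, Any]) -> ScrapeJob:
    """Convert the decoded JSON response into a ScrapeJob."""
    pages = [
        ScrapedPage(
            url=page["url"],
            title=page.get("title"),
            description=page.get("description"),
            category=page.get("category"),
        )
        for page in payload.get("pages") or []  # tolerate a missing or null list
    ]
    return ScrapeJob(
        id=payload["id"],
        created_at=payload["createdAt"],
        modified_at=payload["modifiedAt"],
        org_id=payload["orgId"],
        initiated_by=payload["initiatedBy"],
        name=payload["name"],
        url=payload["url"],
        status=payload["status"],
        application_id=payload.get("applicationId"),
        pages=pages,
    )
```

Required fields are read with `payload[...]` so that a missing key fails loudly with a `KeyError`, while optional ones fall back to `None` or an empty list.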