
Analyze Images & Videos

POST https://api.visionati.com/api/fetch

GET requests with query parameters are also supported, but POST with a JSON body is recommended.
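As a rough illustration of the GET form, the sketch below builds a query-string URL for a single image and checks it against the documented 2048-character query-string limit. It assumes the query parameter names match the JSON body keys listed in the table; `build_get_url` is a hypothetical helper, not part of any official client, and the API key header still has to be sent with the request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.visionati.com/api/fetch"
MAX_QUERY_LENGTH = 2048  # documented limit for the request query string


def build_get_url(image_url: str, **params: object) -> str:
    """Return a GET URL for one image, or raise if the query string is too long.

    Assumption: query parameter names are the same as the JSON body keys.
    """
    query = urlencode({"url": image_url, **params})
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(
            f"query string is {len(query)} characters (limit {MAX_QUERY_LENGTH}); "
            "use POST with a JSON body instead"
        )
    return f"{API_ENDPOINT}?{query}"


print(build_get_url("https://example.com/photo.jpg", feature="brands"))
```

Long inputs such as Base64 file data will exceed the limit quickly, which is one practical reason to prefer POST.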

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| url | No | | URL of the image or video to analyze. Pass a string for one, or an array for multiple. Videos and multiple URLs return an async response. |
| file | No | | Base64 value or Data URI of an image/video file. Pass a string for one, or an array for multiple. |
| file_name | No | | Uploaded file name. String or array. |
| backend | No | all defaults | AI backend(s) to use. String or array. See AI Backends for valid values. |
| feature | No | all | Feature(s) to enable. String or array. See Features for valid values. |
| role | No | general | Description persona. See Roles for valid values. |
| language | No | English | Output language for descriptions. See Supported Languages. If unsupported by the upstream provider, results are returned in English. |
| prompt | No | | Custom text prompt for descriptions. Overrides role. |
| tag_score | No | 0.9 | Minimum tag confidence score. Float between 0 and 1. |
| capture_interval | No | 1 | Seconds between video frame captures. Integer greater than 0. |
| max_frames | No | 3 | Maximum video frames to process. Set to 0 for no limit. |

  • File size: 20MB maximum per file
  • URL length: 2048 characters maximum for the request query string
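
The parameter ranges and the file size limit above can be checked on the client before a request is sent. The following is a minimal sketch using only the Python standard library; `encode_file`, `build_payload`, and `build_request` are hypothetical helper names, and the sketch assumes that "20MB" refers to the raw file size and that at least one of `url` or `file` must be present.

```python
import base64
import json
import urllib.request

API_ENDPOINT = "https://api.visionati.com/api/fetch"
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" measured on the raw bytes


def encode_file(data: bytes) -> str:
    """Base64-encode raw file bytes, enforcing the documented per-file limit."""
    if len(data) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 20MB per-file limit")
    return base64.b64encode(data).decode("ascii")


def build_payload(url=None, file=None, file_name=None, **options) -> dict:
    """Assemble the JSON body and validate the documented parameter ranges."""
    if url is None and file is None:
        # Assumption: the API needs at least one input to analyze.
        raise ValueError("provide at least one of url or file")
    tag_score = options.get("tag_score")
    if tag_score is not None and not 0 <= tag_score <= 1:
        raise ValueError("tag_score must be between 0 and 1")
    interval = options.get("capture_interval")
    if interval is not None and (not isinstance(interval, int) or interval <= 0):
        raise ValueError("capture_interval must be an integer greater than 0")
    max_frames = options.get("max_frames")
    if max_frames is not None and max_frames < 0:
        raise ValueError("max_frames must be 0 (no limit) or a positive integer")
    payload = {"url": url, "file": file, "file_name": file_name, **options}
    # Omit unset parameters so the API applies its documented defaults.
    return {key: value for key, value in payload.items() if value is not None}


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with the documented headers."""
    return urllib.request.Request(
        API_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_payload(url="https://example.com/photo.jpg", tag_score=0.8)
request = build_request(payload, "YOUR_API_KEY")
# urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
```

Validating locally avoids spending a round trip on a request the API would reject anyway.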
curl -X POST "https://api.visionati.com/api/fetch" \
-H "X-API-Key: Token YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/photo.jpg"}'
{
  "request_id": "5ec7ecc6-8a61-417b-b7bb-1aa00b9ed7a7",
  "user_id": 1,
  "urls": ["https://example.com/photo.jpg"],
  "files": 0,
  "file_names": [],
  "features": [],
  "backends": [],
  "role": "general",
  "tag_score": 0.9,
  "capture_interval": 0,
  "max_frames": 3,
  "credits_paid": 3,
  "credits": 1209,
  "all": {
    "assets": [
      {
        "name": "https://example.com/photo.jpg",
        "tags": {
          "sculpture": [
            { "name": "sculpture", "score": 0.9821, "source": "clarifai" },
            { "name": "sculpture", "score": 0.8460, "source": "googlevision" }
          ],
          "temple": [
            { "name": "temple", "score": 0.9730, "source": "clarifai" },
            { "name": "temple", "score": 0.7803, "source": "googlevision" }
          ]
        },
        "colors": {
          "#9c735a": [
            {
              "hex": "#9c735a",
              "score": 0.4203,
              "pixel_fraction": 0.0764,
              "red": 156,
              "green": 115,
              "blue": 90,
              "source": "googlevision"
            }
          ]
        },
        "nsfw": [
          { "label": "sfw", "score": 0.9802, "source": "clarifai" },
          { "label": "nsfw", "score": 0.0197, "source": "clarifai" },
          { "label": "adult", "likelihood": "UNLIKELY", "source": "googlevision" },
          { "label": "violence", "likelihood": "UNLIKELY", "source": "googlevision" }
        ],
        "descriptions": [
          {
            "description": "A serene Buddha statue seated in the lotus position within an intricately carved wooden mandorla...",
            "source": "openai"
          }
        ]
      }
    ],
    "errors": []
  }
}

| Field | Description |
| --- | --- |
| request_id | Unique identifier for this request |
| user_id | Your user ID |
| urls | The URLs you submitted |
| files | Number of files submitted |
| features | Features you requested (empty means all) |
| backends | Backends you requested (empty means all defaults) |
| role | The role used for descriptions |
| tag_score | The minimum tag score filter applied |
| credits_paid | Credits charged for this request |
| credits | Your remaining credit balance |
| all.assets | Array of analyzed images/videos |
| all.errors | Any errors from individual backends |
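
A small sketch of how the top-level fields might be read once the JSON body has been decoded into a dictionary. `summarize_response` is a hypothetical helper; the shape of individual entries in `all.errors` is not described here, so the sketch passes them through untouched.

```python
def summarize_response(body: dict) -> dict:
    """Pull the bookkeeping fields out of a decoded synchronous response."""
    results = body.get("all", {})
    return {
        "request_id": body.get("request_id"),
        "credits_paid": body.get("credits_paid"),
        "credits_remaining": body.get("credits"),
        "asset_count": len(results.get("assets", [])),
        # Entries are returned as-is; their structure is not assumed.
        "errors": results.get("errors", []),
    }


body = {
    "request_id": "5ec7ecc6-8a61-417b-b7bb-1aa00b9ed7a7",
    "credits_paid": 3,
    "credits": 1209,
    "all": {"assets": [{"name": "https://example.com/photo.jpg"}], "errors": []},
}
summary = summarize_response(body)
```

Checking `errors` even on a successful response is worthwhile, since one backend can fail while others return results.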

Each asset in all.assets contains:

| Field | Description |
| --- | --- |
| name | The URL or filename of the analyzed image |
| tags | Object of detected tags grouped by name, each with score and source |
| colors | Dominant colors with hex values, RGB, scores, and pixel fractions |
| nsfw | Content moderation scores from each detection backend |
| descriptions | AI-generated descriptions from each LLM backend |
| faces | Detected faces with emotions, age, and gender (if faces feature enabled) |
| texts | OCR-extracted text with bounding polygons (if texts feature enabled) |
| brands | Detected logos/brands with confidence scores (if brands feature enabled) |
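
Because several backends can report the same tag, and nsfw entries carry either a numeric `score` or a categorical `likelihood` in the sample response, some light post-processing is usually needed. The helpers below are a hypothetical sketch based only on the field names visible in that sample; other backends may return different shapes.

```python
def best_tags(asset: dict, min_score: float = 0.9) -> dict[str, float]:
    """Map each tag name to its highest score across backends, filtered by min_score."""
    best = {}
    for name, hits in asset.get("tags", {}).items():
        top = max((hit["score"] for hit in hits), default=0.0)
        if top >= min_score:
            best[name] = top
    return best


def descriptions_by_source(asset: dict) -> dict[str, str]:
    """Index the generated descriptions by the backend that produced them."""
    return {d["source"]: d["description"] for d in asset.get("descriptions", [])}


def nsfw_signals(asset: dict) -> tuple[dict, dict]:
    """Split nsfw entries into numeric scores and categorical likelihoods."""
    scores, likelihoods = {}, {}
    for entry in asset.get("nsfw", []):
        key = (entry["source"], entry["label"])
        if "score" in entry:
            scores[key] = entry["score"]
        elif "likelihood" in entry:
            likelihoods[key] = entry["likelihood"]
    return scores, likelihoods


asset = {
    "tags": {
        "sculpture": [
            {"name": "sculpture", "score": 0.9821, "source": "clarifai"},
            {"name": "sculpture", "score": 0.8460, "source": "googlevision"},
        ],
        "temple": [
            {"name": "temple", "score": 0.9730, "source": "clarifai"},
            {"name": "temple", "score": 0.7803, "source": "googlevision"},
        ],
    },
    "nsfw": [
        {"label": "nsfw", "score": 0.0197, "source": "clarifai"},
        {"label": "adult", "likelihood": "UNLIKELY", "source": "googlevision"},
    ],
    "descriptions": [{"description": "A serene Buddha statue...", "source": "openai"}],
}
print(best_tags(asset, min_score=0.98))
```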

When you send multiple URLs or files, the API returns an async response with a response_uri to poll for results.

curl -X POST "https://api.visionati.com/api/fetch" \
-H "X-API-Key: Token YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": ["https://example.com/photo1.jpg", "https://example.com/photo2.jpg"]}'
{
  "request_id": "267f99ce-c797-4855-807f-21b204edb7ed",
  "user_id": 1,
  "urls": ["https://example.com/photo1.jpg", "https://example.com/photo2.jpg"],
  "files": 0,
  "file_names": [],
  "features": [],
  "backends": [],
  "role": "general",
  "tag_score": 0.9,
  "capture_interval": 0,
  "max_frames": 3,
  "success": true,
  "response_uri": "https://api.visionati.com/api/response/267f99ce-c797-4855-807f-21b204edb7ed"
}

Use the response_uri to poll for results. See Async Responses for details.
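
A polling loop might look like the sketch below. The format of the polled response is covered in Async Responses and is not reproduced here, so this code makes two loud assumptions: that the poll request takes the same `X-API-Key` header as the original call, and that a finished response carries the same `all` object as a synchronous one. `fetch_json` and `poll_for_results` are hypothetical names.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(uri: str, api_key: str) -> dict:
    """GET a URI and decode the JSON body (assumes the same auth header applies)."""
    request = urllib.request.Request(uri, headers={"X-API-Key": f"Token {api_key}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_for_results(
    response_uri: str,
    fetch: Callable[[str], dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(response_uri) until results appear or attempts run out."""
    for _ in range(max_attempts):
        body = fetch(response_uri)
        # Assumption: results are ready once the body contains an "all" object.
        if "all" in body:
            return body
        sleep(interval)
    raise TimeoutError(f"no results after {max_attempts} attempts")
```

In real use, `fetch` could be `lambda uri: fetch_json(uri, "YOUR_API_KEY")`. Passing it in as a parameter keeps the loop testable without network access.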

Videos are always processed asynchronously. The API extracts frames at the specified capture_interval and analyzes each frame as a separate image.

See Video Processing for details on how video analysis works.
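
To get a feel for how capture_interval and max_frames interact, the sketch below estimates which timestamps would be sampled from a video of a given length. It is purely illustrative: it assumes the first frame is taken at 0 seconds and that the max_frames cap keeps the earliest frames, neither of which is stated here. `frame_timestamps` is a hypothetical helper.

```python
import math


def frame_timestamps(
    duration_seconds: float, capture_interval: int = 1, max_frames: int = 3
) -> list[int]:
    """Estimate sampled timestamps (in seconds) under the stated assumptions."""
    if capture_interval <= 0:
        raise ValueError("capture_interval must be an integer greater than 0")
    # Assumption: sampling starts at t=0 and steps by capture_interval.
    stamps = list(range(0, math.ceil(duration_seconds), capture_interval))
    # max_frames=0 is documented as "no limit".
    if max_frames > 0:
        stamps = stamps[:max_frames]
    return stamps


print(frame_timestamps(10, capture_interval=2, max_frames=3))
print(frame_timestamps(10, capture_interval=4, max_frames=0))
```

Since each frame is analyzed as a separate image, the length of this list gives a rough idea of how many analyses a video would trigger.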