A reference guide for building AI agents: every method, how to authenticate, and the permissions each one needs.
The Hugging Face API is how an app or AI agent works with the Hugging Face Hub and the models on it: searching and reading models, datasets, and Spaces, creating and committing to a repository, and running a model for chat, embeddings, or image generation. Access is granted through a user access token, where a fine-grained token sets each scope to read or write on chosen repositories or organizations, and an agent is limited to what that token reaches. The Hub serves one continuously updated API, and it can push events to a webhook URL when a repository changes.
How an app or AI agent connects to Hugging Face determines what it can reach. There are several routes, one for working with the Hub and its repositories, one for running models, and one for receiving events, each governed by the access token behind it and the scopes that token carries.
The Hub API answers at huggingface.co under the /api path. It reads and writes models, datasets, and Spaces, creates and commits to repositories, and manages the account and organization data around them. It is a single, continuously updated API with no version to pin.
The inference router answers at router.huggingface.co and runs models across partner providers through one token. Its chat completions endpoint is OpenAI-compatible, so existing OpenAI client code can target it by swapping the base URL to the router's v1 path.
Webhooks deliver the chosen repository events to a receiver URL, and an optional secret on the X-Webhook-Secret header confirms each delivery came from Hugging Face. Webhooks are created and listed through the settings webhooks endpoints or the settings page.
Hugging Face's first-party MCP server at huggingface.co/mcp lets an agent search and explore models, datasets, Spaces, and papers, search the documentation, run Jobs, and call community Gradio Spaces as tools. It authenticates with a Hugging Face token and supports streamable HTTP and server-sent-events transports.
A fine-grained access token sets individual scopes, each read or write, on chosen repositories or a specific organization, such as repo.content.read on one model or inference.serverless.write for running models. It is the least-privilege choice and what Hugging Face recommends for production.
A read token grants read access to every repository the account can see, public and private, across the user and the organizations they belong to. It cannot write, which suits downloading models or running inference.
A write token adds write access to every repository the account can write, on top of read. It is coarse, all-or-nothing access, which suits a trusted local workflow more than a shared production agent.
The Hugging Face API is split into areas an agent can act on, such as models, datasets, Spaces, repository management, inference, and webhooks. Each area has its own methods and its own scopes, and some grant access to far more than others.
Search and list models, read a single model's details, and read its model tags.
Search and list datasets, read a single dataset's details, and read its dataset tags.
Search and list Spaces and read a single Space's details.
Create, move, rename, and delete repositories, update their visibility, list commits and references, read the file tree, and commit files.
List a repository's discussions and Pull Requests, create a discussion, and add a comment.
Run a model through the provider router for chat completions, feature extraction embeddings, and text-to-image generation, and list available models.
Read the authenticated user, and list and create collections.
List, read, create, update, and delete the account's webhooks.
Filter by method, access, or permission, or search any path. Select a row for version detail, rate limits, the related webhook event, and the source.
| Method | Endpoint | What it does | Access | Permission | Version | |
|---|---|---|---|---|---|---|
ModelsSearch and list models, read a single model's details, and read its model tags.3 | ||||||
| GET | /api/models | List and search models, filtered by search text, author, or tags, and sorted. | read | repo.content.read | Current | |
Public models are listed without a token. A read or fine-grained token with repo.content.read is needed to include private models the account can see. Paginated through the Link header, with search, author, filter, sort, limit, and full parameters. Acts onmodel Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/models/{repo_id} | Get all information for a single model, optionally at a specific revision. | read | repo.content.read | Current | |
Equivalent to model_info in the Python client. A revision can be appended as /revision/{revision}. Public models need no token; private ones need repo.content.read. Acts onmodel Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/models-tags-by-type | Get all the available model tags hosted on the Hub. | read | — | Current | |
A read-only catalogue of tags, returned without a token. Acts ontag Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
DatasetsSearch and list datasets, read a single dataset's details, and read its dataset tags.3 | ||||||
| GET | /api/datasets | List and search datasets, filtered by search text, author, or tags, and sorted. | read | repo.content.read | Current | |
Public datasets are listed without a token; repo.content.read includes private ones. Paginated through the Link header with the same search, author, filter, sort, limit, and full parameters as models. Acts ondataset Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/datasets/{repo_id} | Get all information for a single dataset, optionally at a specific revision. | read | repo.content.read | Current | |
Equivalent to dataset_info in the Python client. A revision can be appended as /revision/{revision}. Acts ondataset Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/datasets-tags-by-type | Get all the available dataset tags hosted on the Hub. | read | — | Current | |
A read-only catalogue of tags, returned without a token. Acts ontag Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
SpacesSearch and list Spaces and read a single Space's details.2 | ||||||
| GET | /api/spaces | List and search Spaces, filtered by search text, author, or tags, and sorted. | read | repo.content.read | Current | |
Public Spaces are listed without a token; repo.content.read includes private ones. Paginated through the Link header. Acts onspace Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/spaces/{repo_id} | Get all information for a single Space, optionally at a specific revision. | read | repo.content.read | Current | |
Equivalent to space_info in the Python client. A revision can be appended as /revision/{revision}. Acts onspace Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
Repository managementCreate, move, rename, and delete repositories, update their visibility, list commits and references, read the file tree, and commit files.9 | ||||||
| POST | /api/repos/create | Create a repository, a model by default, or a dataset or Space. | write | repo.write | Current | |
The body sets type, name, organization, private, and, for a Space, sdk. Needs repo.write on a fine-grained token, or a write token. Subject to a separate, undocumented repository-creation limit. Acts onrepository Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook event repoRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| DELETE | /api/repos/delete | Delete a repository, a model by default, or a dataset or Space. | write | repo.write | Current | |
The body sets type, name, and organization. Deleting a repository removes it and its files. Needs repo.write or a write token. Acts onrepository Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook event repoRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/repos/move | Move a repository: rename it or transfer it from a user to an organization. | write | repo.write | Current | |
The body sets fromRepo, toRepo, and type. Needs repo.write or a write token. Acts onrepository Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook event repoRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| PUT | /api/repos/{repo_type}/{repo_id}/settings | Update a repository's settings, such as its visibility. | write | repo.write | Current | |
Changing visibility to or from private is recorded as a repo.config update event. Needs repo.write or a write token. Acts onrepository Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook event repo.configRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/models/{namespace}/{repo}/commits/{rev} | List commits on a model repository at a given revision. | read | repo.content.read | Current | |
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts oncommit Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/models/{namespace}/{repo}/refs | List the references, branches and tags, of a model repository. | read | repo.content.read | Current | |
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts onreference Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/models/{namespace}/{repo}/tree/{rev}/{path} | List the files and folders of a model repository at a given revision and path. | read | repo.content.read | Current | |
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts onfile Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/models/{namespace}/{repo}/preupload/{rev} | Check the upload method for files before committing, deciding standard or Git LFS. | write | repo.write | Current | |
The first half of an upload: it tells the client whether content goes through Git LFS. Needs repo.write or a write token. Acts onfile Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/models/{namespace}/{repo}/commit/{rev} | Commit files to a model repository at a given revision. | write | repo.write | Current | |
Records the upload prepared by preupload as a commit, firing a repo.content update event. The same path shape exists under /api/datasets and /api/spaces. Subject to a separate, undocumented commit limit. Needs repo.write or a write token. Acts oncommit Permission (capability) repo.writeVersionAvailable since the API’s base version Webhook event repo.contentRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
Discussions & Pull RequestsList a repository's discussions and Pull Requests, create a discussion, and add a comment.3 | ||||||
| GET | /api/{repoType}/{namespace}/{repo}/discussions | List the discussions and Pull Requests on a repository. | read | repo.content.read | Current | |
On the Hub, a Pull Request is a special type of discussion. Public repositories need no token. Acts ondiscussion Permission (capability) repo.content.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/{repoType}/{namespace}/{repo}/discussions | Create a new discussion or Pull Request on a repository. | write | discussion.write | Current | |
Fires a discussion create event. Needs discussion.write on a fine-grained token, or a write token. Subject to a separate, undocumented discussion limit. Acts ondiscussion Permission (capability) discussion.writeVersionAvailable since the API’s base version Webhook event discussionRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/{repoType}/{namespace}/{repo}/discussions/{num}/comment | Add a comment to a discussion or Pull Request. | write | discussion.write | Current | |
Fires a discussion.comment create event. Needs discussion.write or a write token. Acts oncomment Permission (capability) discussion.writeVersionAvailable since the API’s base version Webhook event discussion.commentRate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
InferenceRun a model through the provider router for chat completions, feature extraction embeddings, and text-to-image generation, and list available models.4 | ||||||
| POST | /v1/chat/completions | Run a model for chat completions through the provider router, OpenAI-compatible. | write | inference.serverless.write | Current | |
Served at router.huggingface.co, not the Hub host. Drop-in OpenAI-compatible: swap the base URL. A model id can carry a provider or policy suffix such as :fastest or :cheapest. Needs inference.serverless.write on a fine-grained token. Acts oncompletion Permission (capability) inference.serverless.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /v1/models | List the models available across providers, with pricing, context length, and throughput. | read | inference.serverless.write | Current | |
Served at router.huggingface.co. Lists models reachable through the router, including per-provider pricing and performance where available. Acts onmodel Permission (capability) inference.serverless.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /models/{repo_id}/pipeline/feature-extraction | Run a model for feature extraction, returning embeddings for text. | write | inference.serverless.write | Current | |
Served through the inference router. Feature extraction returns embeddings for semantic search, retrieval, and recommendation. Needs inference.serverless.write. Acts onembedding Permission (capability) inference.serverless.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /models/{repo_id}/pipeline/text-to-image | Run a model to generate an image from a text prompt. | write | inference.serverless.write | Current | |
Served through the inference router. Returns the generated image bytes. Needs inference.serverless.write. Acts onimage Permission (capability) inference.serverless.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
Account & collectionsRead the authenticated user, and list and create collections.3 | ||||||
| GET | /api/whoami-v2 | Get information about the user or organization behind the token. | read | — | Current | |
Identifies the account the token belongs to and returns its orgs and the token's permissions. Any valid token works. Acts onuser Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/collections | List collections, filtered by owner, item, or search text. | read | collection.read | Current | |
Public collections are returned without a token; collection.read includes private ones. Acts oncollection Permission (capability) collection.readVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/collections | Create a collection of models, datasets, Spaces, or papers. | write | collection.write | Current | |
Needs collection.write on a fine-grained token, or a write token. Acts oncollection Permission (capability) collection.writeVersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
WebhooksList, read, create, update, and delete the account's webhooks.5 | ||||||
| GET | /api/settings/webhooks | List the webhooks configured on the account. | read | — | Current | |
Returns the account's webhooks, their watched repositories, and their target URLs. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/settings/webhooks | Create a webhook that delivers chosen repository events to a URL. | write | — | Current | |
Sets the watched repositories or namespaces, the target URL, and an optional secret sent back as the X-Webhook-Secret header. Each webhook is limited to 1,000 triggers per 24 hours. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| GET | /api/settings/webhooks/{webhookId} | Get a single webhook by id. | read | — | Current | |
Returns one webhook's watched repositories, target URL, and status. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| POST | /api/settings/webhooks/{webhookId} | Update a webhook's watched repositories, target URL, or secret. | write | — | Current | |
Changes which events are delivered and where. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
| DELETE | /api/settings/webhooks/{webhookId} | Delete a webhook by id. | write | — | Current | |
Stops all delivery for that webhook. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗ | ||||||
Hugging Face can notify an app or AI agent when something happens to a repository, instead of the app repeatedly asking. Hugging Face posts the event payload to a webhook URL that has been registered for the chosen repositories and events.
| Event | What it signals | Triggered by |
|---|---|---|
repo | Global events on a repository. The action is one of create, delete, update, or move, fired when a repository is created, deleted, renamed, or has its details change. | /api/repos/create/api/repos/delete/api/repos/move |
repo.content | Events on a repository's content, such as new commits or tags, including the commit created when a Pull Request opens. The action is always update, and the payload lists the references that changed. | /api/models/{namespace}/{repo}/commit/{rev} |
repo.config | Events on a repository's config, such as updating Space secrets, settings, or visibility. The action is always update, and the payload carries the updated config keys. | /api/repos/{repo_type}/{repo_id}/settings |
discussion | Creating a discussion or Pull Request, updating its title or status, or merging it. The action is one of create, delete, or update. | /api/{repoType}/{namespace}/{repo}/discussions |
discussion.comment | Creating, updating, or hiding a comment on a discussion or Pull Request. The action is one of create or update. | /api/{repoType}/{namespace}/{repo}/discussions/{num}/comment |
Hugging Face limits how fast an app or AI agent can call, through a request quota counted over a rolling five-minute window that depends on the account tier behind the token, with separate, higher quotas for downloading repository files.
Hugging Face counts requests in three buckets over a rolling five-minute window. The Hub APIs bucket covers calls like search, repository creation, and user management; the Resolvers bucket covers file downloads and carries a much higher quota; and the Pages bucket covers web pages. Quotas rise with the account tier behind the token: an anonymous caller gets about 500 Hub API requests per window per IP address, a free user 1,000, a PRO user 2,500, a Team organization 3,000, and Enterprise plans from 6,000 up to 100,000 when organization IP ranges are set. Going over returns a 429, and the RateLimit and RateLimit-Policy response headers report the remaining quota and the seconds until reset. Certain actions, such as repository creation, commits, and discussions, carry their own separate, undocumented limits. Each webhook is capped at 1,000 triggers per 24 hours.
List endpoints, such as listing models, datasets, or Spaces, are paginated and return a Link header with a rel="next" URL, which should be followed rather than built by hand, until it is absent. A limit parameter caps the number of results fetched, and a full parameter requests the fuller record for each item. The Python client follows the Link header automatically.
Hub API requests and responses are JSON. Large files are not sent inline: an upload first calls the preupload check to decide whether content goes through Git LFS or the standard path, then a commit call records it, so file size is handled by the storage layer rather than a single request body limit. Listing endpoints return trimmed records by default and the fuller record only when full is requested.
The status codes an agent should handle, and what to do about each.
| Status | Code | Meaning | What to do |
|---|---|---|---|
| 401 | Unauthorized | Authentication is missing, or the access token is invalid or has been deleted. | Send a valid token in the Authorization header as 'Bearer |
| 403 | Forbidden | The token is valid but lacks the scope for this call, or an organization has denied, revoked, or restricted it. A read or write token used where the organization requires a fine-grained token is also rejected here. | Grant the missing scope, or have an organization administrator approve the token. |
| 404 | Not Found | The repository does not exist, or the token cannot see a private repository. A private repository is returned as 404 rather than 403 so that its existence is not confirmed. | Confirm the repository id and that the token has access to it. |
| 429 | Too Many Requests | A rate limit was exceeded for the current five-minute window. The RateLimit and RateLimit-Policy response headers report the remaining quota and the seconds until it resets. | Wait until the window resets, spread requests out, pass a token, or upgrade the account tier. |
The Hub API is served under a single, continuously updated version. There is no dated version to pin; changes ship through dated release notes rather than versioned endpoints.
The Hub API is served under a single, continuously updated version with no dated version header to pin. The machine-readable reference moved to an OpenAPI specification published at a well-known path and an OpenAPI Playground in late 2025. Inference Providers exposes an OpenAI-compatible router under a v1 path for chat completions, so existing OpenAI client code can target Hugging Face by swapping the base URL.
Hugging Face launched a first-party MCP server at huggingface.co/mcp, letting an AI assistant search and explore models, datasets, Spaces, and papers, search the documentation, run Jobs, and call community Gradio Spaces as tools, authenticated with a Hugging Face token.
Hugging Face launched Inference Providers, a unified router that runs hundreds of models across partner providers through one Hugging Face token, with an OpenAI-compatible chat completions endpoint and native Python and JavaScript clients.
Because the API is unversioned, an integration tracks behaviour through the release notes rather than pinning a version.
Hugging Face changelog ↗Bollard AI sits between a team's AI agents and Hugging Face. Grant each agent exactly the access it needs, read or write, area by area, and every call is checked and logged.