Changes from 5 commits
72 changes: 70 additions & 2 deletions docs/docs/usage-guide/configuration_options.md
@@ -1,11 +1,12 @@
The different tools and sub-tools used by PR-Agent are adjustable via a Git configuration file.
There are three main ways to set persistent configurations:
There are four main ways to set persistent configurations:

1. [Wiki](./configuration_options.md#wiki-configuration-file) configuration page
2. [Local](./configuration_options.md#local-configuration-file) configuration file
3. [Global](./configuration_options.md#global-configuration-file) configuration file
4. [External configuration URL](./configuration_options.md#external-configuration-url) (CLI flag)

In terms of precedence, wiki configurations will override local configurations, and local configurations will override global configurations.
In terms of precedence, wiki configurations will override local configurations, local configurations will override global configurations, and global configurations will override an external configuration URL.


For a list of all possible configurations, see the [configuration options](https://github.com/the-pr-agent/pr-agent/blob/main/pr_agent/settings/configuration.toml) page.
@@ -97,3 +98,70 @@ Repositories across your entire Bitbucket organization will inherit the configur

!!! note "Note"
    If both organization-level and project-level global settings are defined, the project-level settings will take precedence over the organization-level configuration. Additionally, parameters from a repository’s local .pr_agent.toml file will always override both global settings.

## External configuration URL

`Platforms supported: GitHub, GitLab, Bitbucket, Azure DevOps`

When running PR-Agent from the CLI (or any wrapper that exposes its arguments), you can merge an additional `.pr_agent.toml` from any URL or local path before the repo-local and global configurations are applied. This is useful when:

- You want a single shared configuration that applies to repositories nested deep inside subgroups, where the [project/group-level lookup](./configuration_options.md#projectgroup-level-configuration-file) only walks one level up.
- The shared configuration is published outside of a Git host (a static site, an internal artifact server, an S3 bucket, etc.).
- You want CI-time control over which defaults are layered in, without committing a file to the target repository.

### Usage

Pass `--extra_config_url` to the CLI, or set the `PR_AGENT_EXTRA_CONFIG_URL` environment variable:

```bash
python -m pr_agent.cli \
  --pr_url=<MR/PR URL> \
  --extra_config_url=https://config.example.com/pr-agent/shared.toml \
  review
```

Accepted values:

- `https://…` or `http://…` — fetched at runtime
- `file:///path/to/shared.toml` — read from the local filesystem
- A bare filesystem path — same as `file://`
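
The three accepted forms can be told apart with a small amount of parsing. The following is a minimal sketch, not PR-Agent's actual code: `classify_source` and its return labels are hypothetical names chosen for illustration, and the Windows drive-letter check is included on the assumption that bare paths such as `C:\shared.toml` should count as local.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper for illustration only; not part of PR-Agent's API.
_DRIVE_RE = re.compile(r"^[A-Za-z]:[\\/]")


def classify_source(source: str) -> str:
    """Return 'remote', 'local', or 'unsupported' for a config source string."""
    source = source.strip()
    if not source:
        return "unsupported"
    # urlparse() would read a Windows drive letter as a URL scheme, so check first.
    if _DRIVE_RE.match(source):
        return "local"
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "remote"
    if scheme in ("", "file"):
        return "local"
    return "unsupported"
```

Anything that is not `http`, `https`, `file`, or a bare path falls through to `unsupported`, which matches the scheme restriction described under "Security and limits".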

### Authentication for private endpoints

For private endpoints (e.g. a GitLab API URL pointing at a private `pr-agent-settings` file), provide a single header via the `PR_AGENT_EXTRA_CONFIG_AUTH_HEADER` environment variable, formatted as `<HeaderName>: <value>`:

```bash
# GitLab Personal Access Token
export PR_AGENT_EXTRA_CONFIG_AUTH_HEADER="PRIVATE-TOKEN: <your-personal-access-token>"

# GitLab CI job token
export PR_AGENT_EXTRA_CONFIG_AUTH_HEADER="JOB-TOKEN: $CI_JOB_TOKEN"

# Generic bearer token
export PR_AGENT_EXTRA_CONFIG_AUTH_HEADER="Authorization: Bearer <your-token>"
```
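
To illustrate the `<HeaderName>: <value>` format, here is a minimal sketch of how such a string could be split into a header name and value. `parse_auth_header` is a hypothetical helper, not a PR-Agent function, and rejecting an empty header name is an extra assumption of this sketch. Splitting on the first colon only keeps values that themselves contain colons intact.

```python
from typing import Optional, Tuple


def parse_auth_header(raw: str) -> Optional[Tuple[str, str]]:
    """Split '<HeaderName>: <value>' into (name, value); return None if malformed.

    Hypothetical helper for illustration only.
    """
    if ":" not in raw:
        return None
    # Split on the first colon so values like 'Bearer a:b' survive unchanged.
    name, value = raw.split(":", 1)
    name, value = name.strip(), value.strip()
    if not name:
        return None
    return name, value
```

A value without any colon (for example a bare token) is treated as malformed here and would not be sent as a header.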

### Precedence

External-URL settings are applied **first**, so every other layer overrides them:

```
built-in defaults
< --extra_config_url
< global pr-agent-settings
< local .pr_agent.toml (repo default branch)
< wiki .pr_agent.toml
< environment variables (PR_AGENT__SECTION__KEY)
```

This means an external URL acts as an organization-wide *default* that any team can still override with their own `pr-agent-settings` or repo-local `.pr_agent.toml`.
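
The layering can be pictured as successive dictionary merges in which later layers win key by key. The sketch below is a simplified model under stated assumptions: `layer_settings` is a hypothetical function, and the section and key names (`example_section`, `max_items`, `tone`) are made up for illustration rather than real PR-Agent settings.

```python
def layer_settings(*layers: dict) -> dict:
    """Merge layers from lowest to highest precedence; later layers win per key.

    Hypothetical model of the precedence chain, for illustration only.
    """
    merged: dict = {}
    for layer in layers:
        for section, values in layer.items():
            # Keys missing from a higher layer keep the lower layer's value.
            merged.setdefault(section, {}).update(values)
    return merged


# Illustrative (made-up) settings: the external file sets two defaults,
# the repo-local file overrides one of them.
external = {"example_section": {"max_items": 3, "tone": "brief"}}
repo_local = {"example_section": {"max_items": 5}}

result = layer_settings(external, repo_local)
print(result)  # {'example_section': {'max_items': 5, 'tone': 'brief'}}
```

The external default for `tone` survives because the repo-local layer does not mention it, while `max_items` takes the repo-local value.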

### Security and limits

The external file is loaded through the same secure loader as the repo-local `.pr_agent.toml`: includes, preloads, custom loaders, and other directives that could execute code or read arbitrary files are rejected. The fetcher additionally:

- Limits the response size to **1 MB**
- Uses a **10-second** request timeout
- Only accepts the `http`, `https`, and `file` schemes (or a bare local path)

If the fetch fails, the failure is logged as a warning and PR-Agent continues with the remaining configuration layers.
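
One common way to enforce a size cap without trusting a `Content-Length` header is to read one byte more than the limit and reject the body if that extra byte arrives. The following is a minimal sketch of that idea; `read_capped` and `MAX_BYTES` are hypothetical names, and an in-memory stream stands in for the HTTP response.

```python
import io
from typing import BinaryIO, Optional

MAX_BYTES = 1 * 1024 * 1024  # mirrors the documented 1 MB cap


def read_capped(stream: BinaryIO, limit: int = MAX_BYTES) -> Optional[bytes]:
    """Read at most `limit` bytes; return None if the stream holds more.

    Hypothetical helper for illustration only.
    """
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    if len(data) > limit:
        return None
    return data


# Usage with an in-memory stream standing in for a response body.
small = read_capped(io.BytesIO(b"[config]\n"), limit=64)
too_big = read_capped(io.BytesIO(b"x" * 65), limit=64)
```

Here `small` holds the full body, while `too_big` is `None` because the stream exceeded the 64-byte limit used in the example.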
14 changes: 14 additions & 0 deletions pr_agent/cli.py
@@ -52,6 +52,18 @@ def set_parser():
    parser.add_argument('--version', action='version', version=f'pr-agent {get_version()}')
    parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', default=None)
    parser.add_argument('--issue_url', type=str, help='The URL of the Issue to review', default=None)
    parser.add_argument(
        "--extra_config_url",
        type=str,
        default=os.environ.get("PR_AGENT_EXTRA_CONFIG_URL"),
        help=(
            "URL or local path of an additional .pr_agent.toml to merge before the "
            "repo-local config (e.g. shared/org defaults). Accepts http(s):// URLs, "
            "file:// URLs, or a filesystem path. For private endpoints, set "
            "PR_AGENT_EXTRA_CONFIG_AUTH_HEADER "
            "(e.g. 'PRIVATE-TOKEN: <token>' or 'JOB-TOKEN: $CI_JOB_TOKEN'). "
            "Repo-local .pr_agent.toml overrides values set here."
        ),
    )
    parser.add_argument('command', type=str, help='The', choices=commands, default='review')
    parser.add_argument('rest', nargs=argparse.REMAINDER, default=[])
    return parser
@@ -76,6 +88,8 @@ def run(inargs=None, args=None):

    command = args.command.lower()
    get_settings().set("CONFIG.CLI_MODE", True)
    if getattr(args, "extra_config_url", None):
        get_settings().set("CONFIG.EXTRA_CONFIG_URL", args.extra_config_url)

    async def inner():
        if args.issue_url:
214 changes: 214 additions & 0 deletions pr_agent/git_providers/utils.py
@@ -1,19 +1,233 @@
import copy
import os
import re
import tempfile
import tomllib
import traceback
from urllib.parse import urlparse
from urllib.request import Request, url2pathname, urlopen

from dynaconf import Dynaconf
from starlette_context import context

from pr_agent.config_loader import get_settings
from pr_agent.custom_merge_loader import validate_file_security
from pr_agent.git_providers import get_git_provider_with_context
from pr_agent.log import get_logger

_MAX_EXTRA_CONFIG_BYTES = 1 * 1024 * 1024 # 1 MB cap for a remote .toml
_FETCH_TIMEOUT_SECONDS = 10
# Bare Windows drive-letter paths (e.g. "C:\\shared.toml", "D:/cfg.toml").
# urlparse() would otherwise interpret the drive letter as a URL scheme.
_WINDOWS_DRIVE_PATH_RE = re.compile(r"^[A-Za-z]:[\\/]")


def _safe_url_for_log(url: str) -> str:
    """
    Render a URL safe for logging: strip userinfo (user:pass@) and the query
    string, both of which may carry credentials (e.g. ?private_token=...).
    Falls back to a redacted placeholder on any parse error.
    """
    try:
        parsed = urlparse(url)
        netloc = parsed.hostname or ''
        if parsed.port:
            netloc = f"{netloc}:{parsed.port}"
        return f"{parsed.scheme}://{netloc}{parsed.path}"
    except Exception:
        return "<extra config URL redacted>"


def _resolve_extra_config_to_file(source):
    """
    Resolve --extra_config_url to a local readable .toml file.

    Accepts:
      - http:// or https:// URL: fetched via urllib (with optional auth header
        from PR_AGENT_EXTRA_CONFIG_AUTH_HEADER, e.g. "PRIVATE-TOKEN: <token>").
      - file:// URL: treated as a local path.
      - bare local path: used directly.

    Returns (path, is_temp). Caller must remove path if is_temp is True.
    Returns (None, False) if source can't be resolved.

    Logs never include the raw URL: `_safe_url_for_log()` strips userinfo and
    query string so embedded credentials don't leak into CI logs.
    """
    # Validate / normalise the input at the boundary
    if not isinstance(source, str):
        get_logger().warning(
            f"Ignoring CONFIG.EXTRA_CONFIG_URL: expected str, got {type(source).__name__}"
        )
        return None, False
    source = source.strip()
    if not source:
        return None, False

    # Bare Windows drive-letter paths must be handled before urlparse(), which
    # would otherwise treat the drive letter as a URL scheme.
    if _WINDOWS_DRIVE_PATH_RE.match(source):
        if os.path.isfile(source):
            return source, False
        get_logger().warning(f"Extra config not found at local path: {source}")
        return None, False

    parsed = urlparse(source)
    scheme = (parsed.scheme or "").lower()

    # Local path (bare or file://)
    if scheme in ("", "file"):
        if scheme == "file":
            # Preserve any non-localhost netloc (UNC-style file://host/share/...)
            # and URL-decode percent-encoded path components via url2pathname.
            netloc = parsed.netloc or ""
            raw = parsed.path
            if netloc and netloc.lower() != "localhost":
                raw = f"//{netloc}{raw}"
            local_path = url2pathname(raw)
        else:
            local_path = source
        if os.path.isfile(local_path):
            return local_path, False
        get_logger().warning(f"Extra config not found at local path: {local_path}")
        return None, False

    if scheme not in ("http", "https"):
        get_logger().warning(f"Unsupported scheme for extra config: {scheme}")
        return None, False

    # Fetch over HTTP(S)
    safe_url = _safe_url_for_log(source)
    headers = {"Accept": "text/plain, application/toml, */*"}
    auth_header = os.environ.get("PR_AGENT_EXTRA_CONFIG_AUTH_HEADER")
    if auth_header:
        if ":" in auth_header:
            name, value = auth_header.split(":", 1)
            headers[name.strip()] = value.strip()
        else:
            # Surface misconfiguration instead of silently dropping the header.
            get_logger().warning(
                "PR_AGENT_EXTRA_CONFIG_AUTH_HEADER is set but malformed "
                "(expected '<HeaderName>: <value>'); ignoring."
            )

    try:
        req = Request(source, headers=headers, method="GET")
        with urlopen(req, timeout=_FETCH_TIMEOUT_SECONDS) as resp:
            data = resp.read(_MAX_EXTRA_CONFIG_BYTES + 1)
        if len(data) > _MAX_EXTRA_CONFIG_BYTES:
            get_logger().warning(
                f"Extra config exceeds {_MAX_EXTRA_CONFIG_BYTES} bytes, skipping: {safe_url}"
            )
            return None, False
        fd, tmp_path = tempfile.mkstemp(suffix=".toml")
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        get_logger().info(f"Fetched extra config from {safe_url} ({len(data)} bytes)")
        return tmp_path, True
    except Exception as e:
        get_logger().warning(f"Failed to fetch extra config from {safe_url}: {e}")
        return None, False


def _apply_settings_from_file(path: str, label: str):
    """
    Merge an external .toml settings file into the global settings, section-by-section.
    Uses the same custom_merge_loader as repo-local settings so security checks
    (forbidden includes/preloads/loaders) apply consistently.
    """
    if not path or not os.path.isfile(path):
        return
    try:
        dynconf_kwargs = {
            "core_loaders": [],
            "loaders": ["pr_agent.custom_merge_loader"],
            "merge_enabled": True,
        }
        try:
            new_settings = Dynaconf(
                settings_files=[path],
                load_dotenv=False,
                envvar_prefix=False,
                **dynconf_kwargs,
            )
        except TypeError as e:
            # Older Dynaconf versions don't accept load_dotenv / merge_enabled.
            # The fallback Dynaconf(...) call below skips our custom_merge_loader,
            # which is where validate_file_security() runs. Pre-validate the file
            # explicitly here so forbidden directives (includes, preloads, custom
            # loaders, ...) still cannot slip through on those older versions.
            try:
                with open(path, "rb") as f:
                    parsed_toml = tomllib.load(f)
                validate_file_security(parsed_toml, path)
            except Exception as sec_err:
                get_logger().warning(
                    f"Extra config failed security pre-validation; skipping: {sec_err}"
                )
                return

            get_logger().warning(
                "Your Dynaconf version does not support disabled "
                "'load_dotenv'/'merge_enabled' parameters. Loading extra config "
                "after explicit security pre-validation; some Dynaconf-level "
                "hardening flags are off. Please upgrade Dynaconf for better "
                "security.",
                artifact={"error": e, "traceback": traceback.format_exc()},
            )
            new_settings = Dynaconf(settings_files=[path])

        merged_sections = []
        for section, contents in new_settings.as_dict().items():
            if not contents:
                continue
            section_dict = copy.deepcopy(get_settings().as_dict().get(section, {}))
            for key, value in contents.items():
                section_dict[key] = value
            get_settings().unset(section)
            get_settings().set(section, section_dict, merge=False)
            merged_sections.append(section)
        # Do NOT log the merged dict: external/repo .pr_agent.toml may contain
        # secrets (e.g. openai.key, gitlab.personal_access_token) that would
        # otherwise leak into CI logs. Section names are safe and sufficient
        # for debugging which file contributed what.
        get_logger().info(
            f"Applied {label} settings from {path} (sections merged: {sorted(merged_sections)})"
        )
    except Exception as e:
        get_logger().warning(f"Failed to apply {label} settings from {path}: {e}")


def apply_repo_settings(pr_url):
    os.environ["AUTO_CAST_FOR_DYNACONF"] = "false"

    # Apply external/shared config FIRST, before constructing the git provider:
    # provider initialisers (e.g. GitLabProvider reads GITLAB.PERSONAL_ACCESS_TOKEN
    # at __init__) need to see any provider-critical settings that come from the
    # extra file. Repo-local .pr_agent.toml is still applied later and overrides
    # the extra file on conflicting keys.
    extra_source = get_settings().get("CONFIG.EXTRA_CONFIG_URL", None)
    if isinstance(extra_source, str) and extra_source.strip():
        extra_path, extra_is_temp = _resolve_extra_config_to_file(extra_source)
        if extra_path:
            try:
                _apply_settings_from_file(extra_path, label="extra")
            finally:
                if extra_is_temp:
                    try:
                        os.remove(extra_path)
                    except Exception as e:
                        get_logger().error(
                            f"Failed to remove temp extra config {extra_path}: {e}"
                        )
    elif extra_source is not None and not isinstance(extra_source, str):
        get_logger().warning(
            "Ignoring CONFIG.EXTRA_CONFIG_URL: expected str, got "
            f"{type(extra_source).__name__}"
        )

    git_provider = get_git_provider_with_context(pr_url)

    if get_settings().config.use_repo_settings_file:
        repo_settings_file = None
        try:
1 change: 1 addition & 0 deletions pr_agent/settings/configuration.toml
@@ -22,6 +22,7 @@ log_level="DEBUG"
use_wiki_settings_file=true
use_repo_settings_file=true
use_global_settings_file=true
extra_config_url="" # optional URL or path to an additional .pr_agent.toml merged before the repo-local config; also settable via --extra_config_url or PR_AGENT_EXTRA_CONFIG_URL. See docs/docs/usage-guide/configuration_options.md#external-configuration-url.
disable_auto_feedback = false
ai_timeout=120 # 2 minutes
skip_keys = []