git scan repository field contains local file:// path instead of upstream remote #4876

@jamesgol

TruffleHog Version

v3.94.3 (47e7b7c)

Trace Output

N/A - no error occurs

Expected Behavior

When scanning a local clone of a repository that has a remote configured (e.g. origin pointing to https://dev.azure.com/org/project/_git/repo), the SourceMetadata.Git.Repository field in JSON output should contain the real upstream remote URL:

"repository": "https://dev.azure.com/org/project/_git/repo"

Actual Behavior

The repository field contains the local filesystem path (as a file:// URI) instead of the upstream remote URL:

"repository": "file:///tmp/RepoName.nyWepr"

This happens because TruffleHog re-clones the target repository to a temporary directory. The clone's origin remote is set to the file:// path of the local repo that was passed in, rather than the real upstream remote configured on that repo.

The root cause is in GetSafeRemoteURL (pkg/sources/git/git.go), which reads the remote URL from the cloned copy. When the clone's origin is a local path, it does not follow through to the original repository to resolve the actual upstream remote.
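One possible direction for a fix, sketched below in Go under stated assumptions: when the clone's origin turns out to be a local path, follow one hop back to the original repository and ask it for its own origin. The helper names `localPathFromRemote` and `resolveUpstream` are hypothetical and are not part of TruffleHog; the sketch shells out to `git remote get-url` for brevity, whereas the real code would presumably use whatever git library it already depends on and would still need to strip credentials from the result.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
	"path/filepath"
	"strings"
)

// localPathFromRemote (hypothetical helper) reports whether a remote URL
// refers to a repository on the local filesystem and, if so, returns its path.
func localPathFromRemote(remote string) (string, bool) {
	if strings.HasPrefix(remote, "file://") {
		u, err := url.Parse(remote)
		if err != nil || u.Path == "" {
			return "", false
		}
		return u.Path, true
	}
	// A bare absolute path is also a valid local remote for git.
	if filepath.IsAbs(remote) {
		return remote, true
	}
	return "", false
}

// resolveUpstream (hypothetical helper) follows a single hop: if remote is a
// local repository, it returns that repository's own "origin" URL. On any
// failure it falls back to the value it was given.
func resolveUpstream(remote string) string {
	path, ok := localPathFromRemote(remote)
	if !ok {
		return remote // already a non-local remote, nothing to resolve
	}
	out, err := exec.Command("git", "-C", path, "remote", "get-url", "origin").Output()
	if err != nil {
		return remote // no origin configured, or not a git repository
	}
	if upstream := strings.TrimSpace(string(out)); upstream != "" {
		return upstream
	}
	return remote
}

func main() {
	// A non-local remote passes through unchanged.
	fmt.Println(resolveUpstream("https://dev.azure.com/org/project/_git/repo"))
	// A file:// remote is resolved against the original repository, if it exists.
	fmt.Println(resolveUpstream("file:///tmp/my-local-clone"))
}
```

Falling back to the input value keeps the current behavior for repositories that have no origin of their own, so the repository field would never end up empty.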

Steps to Reproduce

  1. Clone a repository from a remote (e.g. Azure DevOps, GitHub):
    git clone https://dev.azure.com/org/project/_git/repo /tmp/my-local-clone
  2. Verify the remote is set:
    git -C /tmp/my-local-clone remote -v
    # origin  https://dev.azure.com/org/project/_git/repo (fetch)
    # origin  https://dev.azure.com/org/project/_git/repo (push)
  3. Scan the local clone with TruffleHog:
    trufflehog git file:///tmp/my-local-clone --json
  4. Observe that the repository field in the JSON output contains file:///tmp/my-local-clone instead of https://dev.azure.com/org/project/_git/repo.
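To check step 4 across many results, a small Go sketch like the one below can flag output lines whose repository value is a file:// URI. It searches the decoded JSON for any key named `repository` rather than assuming an exact nesting; the sample line in `main` is hand-written for illustration and is not captured TruffleHog output. The function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// findRepository walks a decoded JSON value and returns the first string
// stored under a key named "repository", at any depth.
func findRepository(v any) (string, bool) {
	switch t := v.(type) {
	case map[string]any:
		if s, ok := t["repository"].(string); ok {
			return s, true
		}
		for _, child := range t {
			if s, ok := findRepository(child); ok {
				return s, true
			}
		}
	case []any:
		for _, child := range t {
			if s, ok := findRepository(child); ok {
				return s, true
			}
		}
	}
	return "", false
}

// isLocalRepository reports whether one line of JSON output carries a
// repository value that starts with file://.
func isLocalRepository(line string) (bool, error) {
	var v any
	if err := json.Unmarshal([]byte(line), &v); err != nil {
		return false, err
	}
	repo, ok := findRepository(v)
	return ok && strings.HasPrefix(repo, "file://"), nil
}

func main() {
	// Illustrative sample only; the nesting shown here is an assumption.
	sample := `{"SourceMetadata":{"Data":{"Git":{"repository":"file:///tmp/my-local-clone"}}}}`
	local, err := isLocalRepository(sample)
	fmt.Println(local, err) // prints "true <nil>"
}
```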

Environment

  • OS: Linux (also reproducible on macOS/Darwin 25.3.0)
  • Version: v3.94.3

Additional Context

This affects any workflow that scans pre-cloned local repositories and relies on the repository field for identifying the source — common when integrating TruffleHog into CI/CD pipelines that batch-scan many repos from Azure DevOps, GitHub Enterprise, etc.

The affected call sites are ScanCommits (line ~697) and ScanStaged (line ~999). Both call GetSafeRemoteURL(repo, "origin") on the temporary clone rather than resolving the original repository's upstream remote.
