Locally-configured sources

Adding a Source configuration

Locally-configured sources are configured in your config.yaml file under the sources field.

Example config

concurrency: "8"
filterUnverified: true
logLevel: info
numWorkers: 16
sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net/wiki
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true
trufflehogAddress: https://gnarly-flying-pancake.c1.prod.trufflehog.org
trufflehogScannerGroup: account 1 - us-west-2
trufflehogScannerToken: thog-agent-XXXXXXXXXXXXXXXXXXXXXXXXXX

Config Definitions:

RunOnce: This field determines the execution mode of the scanner. If set to true, the sources specified will be scanned only once, after which the scanner will terminate. This implies that the “ScanInterval” value associated with each source will be ignored, and periodic scanning will not take place.

Concurrency: This value sets the number of sources that can be scanned at the same time. It essentially controls the parallelism level of the scanning process. If this value is not explicitly set, it defaults to the number of cores available on the machine running the scanner, maximizing hardware utilization.

NumWorkers: This field specifies the number of workers allocated for the detection process. Each worker is an independent unit that can process a portion of the task in parallel with other workers. If left unspecified, the default value is the number of cores present on the machine, meaning each core would have one worker assigned to it.

FilterUnverified: This field acts as a filter on the scanner’s output. If set to true, the scanner will limit its output for unverified results. Specifically, if a chunk of data yields more than one unverified result from a detector, only the first result will be included in the output. This reduces the noise in the case of multiple unverified detections. This filtering does not apply to verified results, which will be outputted normally. By reducing the volume of unverified detections reported, this setting can help focus attention on verified findings.
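To make the FilterUnverified rule concrete, here is a minimal Python sketch of the behavior described above. It is an illustration only, not TruffleHog's implementation: the Result class, its field names, and the filter_unverified function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    chunk_id: str   # hypothetical identifier for the scanned chunk of data
    detector: str   # name of the detector that produced the result
    secret: str     # the detected value
    verified: bool  # whether verification succeeded


def filter_unverified(results):
    """Keep every verified result, but only the first unverified
    result for each (chunk, detector) pair."""
    seen = set()
    kept = []
    for r in results:
        if r.verified:
            kept.append(r)  # verified results are never filtered
            continue
        key = (r.chunk_id, r.detector)
        if key not in seen:
            seen.add(key)
            kept.append(r)  # first unverified result for this pair
    return kept


results = [
    Result("chunk-1", "aws", "AKIA-one", verified=False),
    Result("chunk-1", "aws", "AKIA-two", verified=False),
    Result("chunk-1", "aws", "AKIA-three", verified=True),
    Result("chunk-2", "aws", "AKIA-four", verified=False),
]
print([r.secret for r in filter_unverified(results)])
# ['AKIA-one', 'AKIA-three', 'AKIA-four']
```

The second unverified result from chunk-1 is dropped, while the verified result and the unverified result from a different chunk are kept.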

Artifactory

Artifactory with Access Token

It is recommended to generate an access token for a user with read-only permissions. To do so, create a new user in the JFrog Artifactory UI under “Identity and Access.” Leave all roles unchecked and ensure the user is added to the readers group (selected by default). Once created, navigate to the “Access Tokens” tab and generate a token for the newly created user.

sources:
- connection:
    '@type': type.googleapis.com/sources.Artifactory
    accessToken: access_token
    endpoint: https://example.jfrog.io
    repositories:
    - repo1
    - repo2
  name: Artifactory repository artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JFROG_ARTIFACTORY
  verify: true

Artifactory with Basic Authentication

Alternatively, basic authentication can be used.

sources:
- connection:
    '@type': type.googleapis.com/sources.Artifactory
    basicAuth:
      password: secret
      username: username
    endpoint: https://example.jfrog.io
    repositories:
    - repo1
    - repo2
  name: Artifactory repository artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JFROG_ARTIFACTORY
  verify: true

The password field may be one of:

  • Access token
  • Account password
  • API key

Azure Repos

Azure Repos can currently be scanned using a personal access token (PAT). To create a PAT, follow these steps:

  1. Go to your Azure DevOps account and click on the “User Settings” icon in the top right corner next to your profile picture.
  2. Click on “Personal access tokens”.
  3. Click on “New Token”.
  4. Enter a name for the token, select an organization and select the “Custom defined” option. Then, select the “Code (read)” scope. Please make sure that the “All accessible organizations” option is selected if you want to scan all repositories from all organizations.
  5. Click on “Create”.

When providing organizations, projects and repositories in the config, please take note of the following:

  • At least one organization is required.
  • Hierarchy: organizations > projects > repositories. Ensure projects are from specified organizations, and repositories are from specified projects.
  • Specifying only “organizations” will result in scanning all their projects. Specifying only “projects” will scan all their repositories.
  • The “ignore” filter always overrides the “include” filter, applicable to both “projects” and “repositories”.
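The include and ignore rules above can be sketched in a few lines of Python. This is a hedged illustration of the stated precedence, not the scanner's actual code; should_scan and its parameters are hypothetical names, and an empty include list is assumed to mean "scan everything at this level".

```python
def should_scan(name, include=(), ignore=()):
    """Return True if `name` should be scanned.

    An empty include list is treated as "everything". An ignore entry
    always wins, even if the same name is also listed in include.
    """
    if name in ignore:
        return False
    return not include or name in include


# Only Project1 is included, and Project2 is ignored.
print(should_scan("Project1", include=["Project1"], ignore=["Project2"]))  # True
# Ignore overrides include when a name appears in both.
print(should_scan("Project2", include=["Project1", "Project2"], ignore=["Project2"]))  # False
# No include list: every project that is not ignored is scanned.
print(should_scan("Project3", ignore=["Project2"]))  # True
```

The same check would apply to repositories with the repositories and ignoreRepos fields.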

Note: OAuth and basic authentication will be supported in the future.

sources:
- connection:
    '@type': type.googleapis.com/sources.AzureRepos
    ignoreProjects:
    - Project2
    ignoreRepos:
    - https://dev.azure.com/trufflesecurity/IgnoreRepo
    includeForks: true
    organizations:
    - trufflesecurity
    projects:
    - Project1
    repositories:
    - https://dev.azure.com/trufflesecurity/RepoRascal
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Azure Repos
  scanInterval: 43200s
  type: SOURCE_TYPE_AZURE_REPOS
  verify: true

BitBucket

You have three options depending on your BitBucket instance and the authentication method you prefer:

1. BitBucket Server with Basic Authentication

With this method, you can use an App Password or a token for authentication. Use a configuration like the following:

sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://bitbucket.ourbusiness.com
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
  name: BitBucket Server
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true

Please note:

If you use an App Password, ensure that it has Read access for both the Account and Repositories. If you use a token, it can be used in place of the password.

2. BitBucket Cloud with Personal Access Token (PAT)

For cloud-hosted BitBucket instances, you can also use a Personal Access Token (PAT). Use a configuration like the following:

sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    endpoint: https://bitbucket.org/myworkspace
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
    token: ATCTTxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  name: BitBucket Cloud Token Auth
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true

BitBucket provides three types of access tokens. Among them, we recommend using the “Workspace Access Token” as it provides access to all projects and repositories.

Please note:

If you specify both the “repositories” and “ignoreRepos” fields, the application will prioritize the “repositories” field. To avoid confusion, we recommend specifying only one of these fields.

3. BitBucket Cloud with Basic Authentication

If you’re using a cloud-hosted BitBucket instance, you can use basic authentication. Use a configuration like the following:

sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://bitbucket.org/myworkspace
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
  name: BitBucket Cloud Basic Auth
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true

Please note:

The password must have Read access for both the Account and Repositories. Tokens CANNOT be used in place of the password for BitBucket Cloud.

Buildkite

Your API Access Token must have GraphQL API access enabled along with the following REST API Scopes: Organization Access, Read Artifacts, Read Builds, Read Build Logs, and Read Pipelines.

sources:
- connection:
    '@type': type.googleapis.com/sources.Buildkite
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Buildkite logs and artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_BUILDKITE
  verify: true

Confluence

Confluence Cloud must be configured with basic authentication, using an email address as the username and a Confluence Cloud token as the password. See “Creating a token on Atlassian Cloud”.

For on-premise Confluence instances, you can use a username and password with basic authentication, or you can use a personal access token (PAT) with token authentication.

Confluence with basic authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net/wiki
    ignoreSpaces:
    - Space1
    includeAttachments: true
    skipHistory: true
    spaces:
    - Space2
    - Random-Space
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true

Confluence with personal access token (PAT)

sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    endpoint: https://ourbusiness.atlassian.net/wiki
    ignoreSpaces:
    - Space2
    includeAttachments: true
    skipHistory: true
    spaces:
    - Space1
    - Random-Space
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true

Docker

The Docker integration supports unauthenticated scans, the Docker keychain (docker login), bearer tokens, and basic authentication. If you provide an image without a tag, latest is assumed.
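As a rough illustration of the default-tag rule, the following Python sketch normalizes an image reference the way Docker references are conventionally read. It is not taken from TruffleHog's code; with_default_tag is a hypothetical helper, and the handling of digests and registry ports is an assumption based on standard Docker reference syntax.

```python
def with_default_tag(image: str) -> str:
    """Append ':latest' when the image reference has no tag or digest."""
    if "@" in image:
        return image  # pinned by digest, e.g. name@sha256:...
    # Only the last path component can carry a tag; a colon earlier in
    # the reference belongs to a registry host and port.
    last_component = image.rsplit("/", 1)[-1]
    if ":" in last_component:
        return image  # tag already present
    return image + ":latest"


print(with_default_tag("trufflesecurity/secrets"))      # trufflesecurity/secrets:latest
print(with_default_tag("trufflesecurity/secrets:1.0"))  # trufflesecurity/secrets:1.0
print(with_default_tag("localhost:5000/secrets"))       # localhost:5000/secrets:latest
```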

Docker with no authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    images:
    - trufflesecurity/secrets
    unauthenticated: {}
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true

Docker with Docker keychain authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    dockerKeychain: true
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true

Docker with basic authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: user
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true

Docker with bearer token authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    bearerToken: token-value
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true

Filesystem

sources:
- connection:
    '@type': type.googleapis.com/sources.Filesystem
    directories:
    - /home/me/dev
  name: Filesystem
  scanInterval: 43200s
  type: SOURCE_TYPE_FILESYSTEM
  verify: true

File and Stdin

Help

$ trufflehog file --help         
usage: TruffleHog file [<flags>] [<path>]

Scan a file (defaults to standard in)

Flags:
      --help                  Show context-sensitive help (also try --help-long and --help-man).
  -v, --debug                 Enable debug mode.
      --trace                 Enable tracing of code line numbers.
      --json                  Enable JSON output.
      --send-error-telemetry  Turns error telemetry off.
      --quiet                 Only show results.

Args:
  [<path>]  Path of the file to scan

Example

You will need to obtain credentials to run this. You can get them by creating a scanner group (on your isolated instance go to settings -> scanners) and downloading the config.

Tip: if you invoke the scanner frequently, run it with --no-update to skip update checks and cut down on startup time.

# Three different ways to invoke the stdin and file scanner

./trufflehog file --config config.yaml --json /etc/passwd

cat /etc/passwd | ./trufflehog file --config config.yaml --json

./trufflehog file --config config.yaml --json < /etc/passwd

When using Docker, you must include the --interactive or -i flag (but not -t or --tty) for Docker to pass stdin to TruffleHog:

docker run --net=host --restart=unless-stopped -v $(pwd)/config.yaml:/tmp/config.yaml -i us-docker.pkg.dev/thog-artifacts/public/scanner:latest file --config=/tmp/config.yaml

GCS (Google Cloud Storage)

ProjectID is required. If you omit buckets, all buckets that the credential can list and access will be scanned.

When using the include/exclude filters for both buckets and objects, the include filters take precedence if both are specified. It is recommended to only use one of the two filters for each.

Example IAM policy:

{
  "version": "1",
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:<user_email>"
      ]
    },
    {
      "role": "roles/viewer",
      "members": [
        "user:<user_email>"
      ]
    }
  ]
}

Configuration:

sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    adc: {}
    excludeBuckets:
    - bucket3
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    projectId: my-project # REQUIRED
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true

GCS with service account file (JSON)

Example IAM policy:

{
  "version": "1",
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:<user_email>"
      ]
    },
    {
      "role": "roles/viewer",
      "members": [
        "user:<user_email>"
      ]
    }
  ]
}

Configuration:

sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    excludeBuckets:
    - bucket3
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    projectId: my-project # REQUIRED
    serviceAccountFile: /path/to/service-account.json
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true

GCS without authentication

Can only be used for public buckets.

Configuration:

sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    unauthenticated: {}
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true

Gerrit

If you omit projects, all code projects that the credential can list and access will be scanned.

sources:
- connection:
    '@type': type.googleapis.com/sources.Gerrit
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://gerrit.example.com
  name: Gerrit
  scanInterval: 43200s
  type: SOURCE_TYPE_GERRIT
  verify: true

Git

The Git source expects a list of repository URIs and/or a list of local directories containing repositories to scan.

Unauthenticated

sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    directories:
    - /home/me/dev/vscode
    repositories:
    - https://github.com/dustin-decker/secretsandstuff.git
    unauthenticated: {}
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true

Basic Auth

sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    basicAuth:
      password: clonePassword
      username: cloneUser
    repositories:
    - https://github.com/dustin-decker/secretsandstuff.git
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true

SSH Auth

sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    repositories:
    - ssh://github.com/dustin-decker/secretsandstuff.git
    sshAuth: {}
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true

GitHub

Personal Access Tokens should be created with the following scopes: repo, gist, and read:org.

sources:
- connection:
    '@type': type.googleapis.com/sources.GitHub
    endpoint: https://github.ourbusiness.com
    ignoreRepos:
    - trufflesecurity/trufflehog
    - torvalds/linux
    includeForks: true
    organizations:
    - trufflesecurity
    repositories:
    - https://github.ourbusiness.com/torvalds/linux.git
    scanUsers: true
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: GitHub
  scanInterval: 43200s
  type: SOURCE_TYPE_GITHUB
  verify: true

GitLab

Token Auth

The GitLab token should be created with the read_api scope.

sources:
- connection:
    '@type': type.googleapis.com/sources.GitLab
    endpoint: https://gitlab.ourbusiness.com
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: GitLab
  scanInterval: 43200s
  type: SOURCE_TYPE_GITLAB
  verify: true

Basic Auth

sources:
- connection:
    '@type': type.googleapis.com/sources.GitLab
    basicAuth:
      password: t0ken
      username: svc-user
    endpoint: https://gitlab.ourbusiness.com
    ignoreRepos:
    - trufflesecurity/trufflehog
    - torvalds/linux
  name: GitLab
  scanInterval: 43200s
  type: SOURCE_TYPE_GITLAB
  verify: true

Jenkins

sources:
- connection:
    '@type': type.googleapis.com/sources.Jenkins
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://jenkins.example.com
  name: Jenkins logs and artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JENKINS
  verify: true

JIRA

JIRA Cloud must be configured with basic authentication, using an email address as the username and a JIRA Cloud token as the password. See “Creating a token on Atlassian Cloud”.

For on-premise JIRA instances, you can use a username and password with basic authentication, or you can use a personal access token (PAT) with token authentication.

If you omit projects, all projects that the credential can list and access will be scanned.

JIRA with basic authentication

sources:
- connection:
    '@type': type.googleapis.com/sources.JIRA
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net
    projects:
    - ENG
    - ITSYS
  name: JIRA
  scanInterval: 43200s
  type: SOURCE_TYPE_JIRA
  verify: true

JIRA with personal access token (PAT)

sources:
- connection:
    '@type': type.googleapis.com/sources.JIRA
    endpoint: https://ourbusiness.atlassian.net
    projects:
    - ENG
    - ITSYS
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: JIRA
  scanInterval: 43200s
  type: SOURCE_TYPE_JIRA
  verify: true

Microsoft Teams

During the initial setup of the Teams integration, an admin account will need to provide permission as part of the OAuth2 flow.

To enable the necessary functionality in the Teams integration, the following scopes are required:

  • ChannelMessage.Read.All: This scope allows TruffleHog to access and read messages in all public and private channels within the Teams workspace, unless specific include/exclude filters are applied during the setup process. Please note that scanning direct messages is currently not supported.

Currently, each integration can only be configured to scan a single team. If you want to scan multiple teams, you will need to create separate integrations. However, we plan to enhance this capability in the future, allowing multiple teams to be scanned from a single integration.

When configuring the Teams scanner from the UI dashboard, the Team ID refers to your Microsoft Teams team ID. To find it, open the Teams app, click the … button next to the team in the left-hand pane, and click “Get link to team”. Make sure the menu is for the team and not for a channel.

The Team ID is the value that follows groupId= in the link provided (ex. xxxgroupId=&tenantId=xxx).
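If you prefer to extract the Team ID programmatically, the following Python sketch reads the groupId query parameter from a copied team link using the standard library. The link shown is a made-up placeholder with hypothetical IDs, and team_id_from_link is a name introduced only for this example.

```python
from urllib.parse import urlparse, parse_qs


def team_id_from_link(link: str) -> str:
    """Return the value of the groupId query parameter in a team link."""
    query = parse_qs(urlparse(link).query)
    values = query.get("groupId")
    if not values:
        raise ValueError("link has no groupId parameter")
    return values[0]


# Hypothetical link; real links come from "Get link to team".
link = (
    "https://teams.microsoft.com/l/team/example"
    "?groupId=11111111-2222-3333-4444-555555555555"
    "&tenantId=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
)
print(team_id_from_link(link))  # 11111111-2222-3333-4444-555555555555
```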

The Teams integration requires the web UI in order to successfully scan sources. (Local config will be made available in the near future with the use of Client Credentials or OAuth2.)

Microsoft SharePoint

During the initial setup of the SharePoint integration, an admin account will need to provide permission as part of the OAuth2 flow.

To enable the necessary functionality in the SharePoint integration, the following scopes are required:

  • AllSites.Read: This scope allows the scanner to access and read all sites within your SharePoint workspace.
  • Sites.Search.All: This scope allows the scanner to navigate through all the contents within your SharePoint workspace.
  • offline_access: This scope allows TruffleHog Enterprise to maintain the state of the secrets detected by the scanner.

When configuring the SharePoint scanner from the UI, the Site URL refers to your Microsoft SharePoint site.

The SharePoint integration requires the web UI in order to successfully scan sources. (Local config will be made available in the near future with the use of Client Credentials or OAuth2.)

Slack

sources:
- connection:
    '@type': type.googleapis.com/sources.Slack
    channels:
    - General
    - Random
    endpoint: https://mybusiness.slack.com
    ignoreList:
    - General
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Slack
  scanInterval: 43200s
  type: SOURCE_TYPE_SLACK
  verify: true

Single Workspace App

If you are able, we recommend using the Slack install from the UI: it is much easier, and it scans faster because it has higher rate limits.

You may also create your own single-workspace Slack app to use with TruffleHog and provide the refresh token in the token field in the example above. Below are the steps to create the app.

  1. Start creating the app here.

  2. Give the app a name and choose the workspace you want TruffleHog to operate on. (You will need separate apps to scan multiple workspaces.)

  3. Update the “User Token Scopes” section with the following scopes:
  • users:read
  • users:read.email
  • channels:history
  • channels:read
  • groups:history
  • groups:read
  • files:read

  4. Make sure everything is saved and looks correct, then install your app.

  5. If your user does not have permission to install the app, Slack may send a request to your Slack admin asking them to approve it. If so, it may be a good idea to give them a heads-up beforehand.

  6. Copy your newly minted token and paste it into the token field of the local configuration file above. (Tip: remove the channels line and its values if you want TruffleHog to scan all accessible channels.)

  7. Once you run your local scan, TruffleHog will pick up and scan the configured Slack source.

S3

If you omit buckets, all buckets that the credential can list and access will be scanned.

Example IAM policy:

{
	"Version":"2012-10-17",
	"Statement":[
		{
			"Effect":"Allow",
			"Action":[
				"s3:GetBucketLocation",
				"s3:ListAllMyBuckets",
				"s3:ListBucket",
				"s3:GetObject"
			],
			"Resource":"*"
		}
	]
}

Configuration:

sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    cloudEnvironment: {}
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true

S3 with static credentials

sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    accessKey:
      key: AKIAKEYID
      secret: XXXXXXXXXXXXXXXXXXXXXXXXXX
    buckets:
    - bucket-one
    - bucket-two
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true

S3 with AWS IAM Role Assumption

sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    roles:
    - roleArn-1
    - roleArn-2
    sessionToken: {}
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true

IAM roles can be assumed by an IAM entity, such as a user or role, that is an allowed principal in an IAM trust policy attached to the role.

Example trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/Bob"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Otherwise, roles behave similarly to IAM user credential access. Passing a specific role ARN and bucket will result in only that bucket being scanned.

Passing in a role ARN without specifying a bucket will result in all buckets that the role can list being scanned. Multiple roles can be specified as individual arguments.

If a bucket or buckets are supplied in addition to multiple roles, a scan will be attempted against each bucket by each role.
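The combinations described above can be pictured with a short Python sketch. This is only an illustration of the stated behavior, not the scanner's implementation; plan_scans is a hypothetical name, and None is used here as a stand-in for "all buckets the role can list".

```python
def plan_scans(roles, buckets=()):
    """Return the (role, bucket) pairs a scan would attempt.

    With no buckets, each role scans everything it can list, which is
    represented as (role, None). Otherwise every role is tried against
    every bucket.
    """
    if not buckets:
        return [(role, None) for role in roles]
    return [(role, bucket) for role in roles for bucket in buckets]


# One role and one bucket: only that bucket is scanned.
print(plan_scans(["roleArn-1"], ["bucket-one"]))
# [('roleArn-1', 'bucket-one')]

# Two roles, no buckets: each role scans all buckets it can list.
print(plan_scans(["roleArn-1", "roleArn-2"]))
# [('roleArn-1', None), ('roleArn-2', None)]

# Two roles and two buckets: four scan attempts.
print(len(plan_scans(["roleArn-1", "roleArn-2"], ["bucket-one", "bucket-two"])))
# 4
```

Note that the number of attempts grows as roles times buckets, so a role that cannot access a listed bucket would presumably still produce an attempt against it.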

The recommended approach is to define an IAM user or service account with minimal or no local resource permissions, and define it as an allowed principal in a trust policy using the example above. This policy will then need to be attached to roles in every environment or AWS account in which credential scans are needed.


Last updated on 09-25-2023