Locally-configured sources
Adding a Source configuration
Locally-configured sources are defined in your config.yaml file under the sources field.
Example config
concurrency: "8"
filterUnverified: true
logLevel: info
numWorkers: 16
sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net/wiki
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true
trufflehogAddress: https://gnarly-flying-pancake.c1.prod.trufflehog.org
trufflehogScannerGroup: account 1 - us-west-2
trufflehogScannerToken: thog-agent-XXXXXXXXXXXXXXXXXXXXXXXXXX
Config Definitions:
RunOnce: This field determines the execution mode of the scanner. If set to true, the sources specified will be scanned only once, after which the scanner will terminate. This implies that the “ScanInterval” value associated with each source will be ignored, and periodic scanning will not take place.
Concurrency: This value sets the number of sources that can be scanned at the same time. It essentially controls the parallelism level of the scanning process. If this value is not explicitly set, it defaults to the number of cores available on the machine running the scanner, maximizing hardware utilization.
NumWorkers: This field specifies the number of workers allocated for the detection process. Each worker is an independent unit that can process a portion of the task in parallel with other workers. If left unspecified, the default value is the number of cores present on the machine, meaning each core would have one worker assigned to it.
FilterUnverified: This field acts as a filter on the scanner’s output. If set to true, the scanner will limit its output for unverified results. Specifically, if a chunk of data yields more than one unverified result from a detector, only the first result will be included in the output. This reduces the noise in the case of multiple unverified detections. This filtering does not apply to verified results, which will be outputted normally. By reducing the volume of unverified detections reported, this setting can help focus attention on verified findings.
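As a rough sketch of how these options combine, the fragment below sets all four at the top level of config.yaml for a one-off scan. The key name runOnce is an assumption inferred from the lowerCamelCase spelling of the other keys in the example config (numWorkers, filterUnverified); confirm it against the config downloaded for your scanner group. The numeric values are arbitrary.

```yaml
# Sketch only: the `runOnce` key spelling is assumed, and the values are illustrative.
runOnce: true            # scan every source once, then exit (scanInterval is ignored)
concurrency: "4"         # scan up to 4 sources at the same time
numWorkers: 8            # 8 detection workers
filterUnverified: true   # keep only the first unverified result per chunk and detector
```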
Artifactory
Artifactory with Access Token
It is recommended to generate an access token for a user with read-only permissions. To do so, create a new user in the JFrog Artifactory UI under “Identity and Access.” Leave all roles unchecked and ensure the user is added to the readers group (selected by default). Once created, navigate to the “Access Tokens” tab and generate a token for the newly created user.
sources:
- connection:
    '@type': type.googleapis.com/sources.Artifactory
    accessToken: access_token
    endpoint: https://example.jfrog.io
    repositories:
    - repo1
    - repo2
  name: Artifactory repository artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JFROG_ARTIFACTORY
  verify: true
Artifactory with Basic Authentication
Alternatively, basic authentication can be used.
sources:
- connection:
    '@type': type.googleapis.com/sources.Artifactory
    basicAuth:
      password: secret
      username: username
    endpoint: https://example.jfrog.io
    repositories:
    - repo1
    - repo2
  name: Artifactory repository artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JFROG_ARTIFACTORY
  verify: true
password may be one of:
- Access token
- Account password
- API key
Azure Repos
Azure Repos can currently be scanned using a personal access token (PAT). To create a PAT, follow these steps:
- Go to your Azure DevOps account and click on the “User Settings” icon in the top right corner next to your profile picture.
- Click on “Personal access tokens”.
- Click on “New Token”.
- Enter a name for the token, select an organization and select the “Custom defined” option. Then, select the “Code (read)” scope. Please make sure that the “All accessible organizations” option is selected if you want to scan all repositories from all organizations.
- Click on “Create”.
When providing organizations, projects and repositories in the config, please take note of the following:
- At least one organization is required.
- Hierarchy: organizations > projects > repositories. Ensure projects are from specified organizations, and repositories are from specified projects.
- Specifying only “organizations” will result in scanning all their projects. Specifying only “projects” will scan all their repositories.
- The “ignore” filter always overrides the “include” filter, applicable to both “projects” and “repositories”.
Note: OAuth and basic authentication will be supported in the future.
sources:
- connection:
    '@type': type.googleapis.com/sources.AzureRepos
    ignoreProjects:
    - Project2
    ignoreRepos:
    - https://dev.azure.com/trufflesecurity/IgnoreRepo
    includeForks: true
    organizations:
    - trufflesecurity
    projects:
    - Project1
    repositories:
    - https://dev.azure.com/trufflesecurity/RepoRascal
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Azure Repos
  scanInterval: 43200s
  type: SOURCE_TYPE_AZURE_REPOS
  verify: true
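The hierarchy rules above suggest that a whole organization can be scanned by listing only organizations. The following is a minimal sketch that uses only fields from the example above; the source name is illustrative.

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.AzureRepos
    ignoreProjects:
    - Project2             # ignore always overrides include
    organizations:
    - trufflesecurity      # no projects or repositories listed: all projects are scanned
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Azure Repos (whole organization)   # illustrative name
  scanInterval: 43200s
  type: SOURCE_TYPE_AZURE_REPOS
  verify: true
```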
BitBucket
You have three options depending on your BitBucket instance and the authentication method you prefer:
1. BitBucket Server with Basic Authentication
With this method, you can authenticate with an App Password or a token. Configure the source as follows:
sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://bitbucket.ourbusiness.com
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
  name: BitBucket Server
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true
Please note:
If you use an App Password, ensure that it has Read access for both the Account and Repositories. If you use a token, it can be used in place of the password.
2. BitBucket Cloud with Personal Access Token (PAT)
For cloud-hosted BitBucket instances, you can also use a Personal Access Token (PAT). Configure the source as follows:
sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    endpoint: https://bitbucket.org/myworkspace
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
    token: ATCTTxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  name: BitBucket Cloud Token Auth
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true
BitBucket provides three types of access tokens. Among them, we recommend using the “Workspace Access Token” as it provides access to all projects and repositories.
Please note:
If you specify both the “repositories” and “ignoreRepos” fields, the application will prioritize the “repositories” field. To avoid confusion, we recommend specifying only one of these fields.
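Following that recommendation, a configuration that sets only ignoreRepos might look like the sketch below. It reuses the fields from the example above; the expectation that every other repository reachable by the token is scanned is an assumption, not something this page states.

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    endpoint: https://bitbucket.org/myworkspace
    ignoreRepos:            # only the ignore list is set; `repositories` is left out
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    token: ATCTTxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  name: BitBucket Cloud Token Auth
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true
```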
3. BitBucket Cloud with Basic Authentication
If you’re using a cloud-hosted BitBucket instance, you can use basic authentication. Configure the source as follows:
sources:
- connection:
    '@type': type.googleapis.com/sources.Bitbucket
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://bitbucket.org/myworkspace
    ignoreRepos:
    - https://bitbucket.ourbusiness.com/linux-kernel/ignore.git
    - https://bitbucket.ourbusiness.com/torvalds/ignore2.git
    repositories:
    - https://bitbucket.ourbusiness.com/linux-kernel/linux.git
    - https://bitbucket.ourbusiness.com/torvalds/linux.git
  name: BitBucket Cloud Basic Auth
  scanInterval: 43200s
  type: SOURCE_TYPE_BITBUCKET
  verify: true
Please note:
The password must have Read access for both the Account and Repositories. Tokens CANNOT be used in place of the password for BitBucket Cloud.
Buildkite
Your API Access Token must have GraphQL API access enabled along with the following REST API Scopes: Organization Access, Read Artifacts, Read Builds, Read Build Logs, and Read Pipelines.
sources:
- connection:
    '@type': type.googleapis.com/sources.Buildkite
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Buildkite logs and artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_BUILDKITE
  verify: true
Confluence
Confluence Cloud must be configured with basic authentication, using an email address as the username and a Confluence Cloud token as the password.
For on-premise Confluence instances, you can use a username and password with basic authentication, or you can use a personal access token (PAT) with token authentication.
Confluence with basic authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net/wiki
    ignoreSpaces:
    - Space1
    includeAttachments: true
    skipHistory: true
    spaces:
    - Space2
    - Random-Space
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true
Confluence with personal access token (PAT)
sources:
- connection:
    '@type': type.googleapis.com/sources.Confluence
    endpoint: https://ourbusiness.atlassian.net/wiki
    ignoreSpaces:
    - Space2
    includeAttachments: true
    skipHistory: true
    spaces:
    - Space1
    - Random-Space
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Confluence
  scanInterval: 43200s
  type: SOURCE_TYPE_CONFLUENCE
  verify: true
Docker
The Docker integration supports unauthenticated scans, the Docker keychain (docker login), bearer tokens, and basic authentication. If you provide an image without a tag, latest is assumed.
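To scan a specific version rather than latest, include the tag in the image name. The following is a minimal sketch whose fields mirror the unauthenticated configuration in this section; the tag 1.2.3 is a placeholder, not a known tag of this image.

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    images:
    - trufflesecurity/secrets          # no tag given: latest is assumed
    - trufflesecurity/secrets:1.2.3    # explicit tag (placeholder value)
    unauthenticated: {}
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true
```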
Docker with no authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    images:
    - trufflesecurity/secrets
    unauthenticated: {}
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true
Docker with Docker keychain authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    dockerKeychain: true
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true
Docker with basic authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: user
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true
Docker with bearer token authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.Docker
    bearerToken: token-value
    images:
    - trufflesecurity/secrets
  name: Docker
  scanInterval: 43200s
  type: SOURCE_TYPE_DOCKER
  verify: true
Filesystem
sources:
- connection:
    '@type': type.googleapis.com/sources.Filesystem
    directories:
    - /home/me/dev
  name: Filesystem
  scanInterval: 43200s
  type: SOURCE_TYPE_FILESYSTEM
  verify: true
File and Stdin
Help
$ trufflehog file --help
usage: TruffleHog file [<flags>] [<path>]
Scan a file (defaults to standard in)
Flags:
      --help                  Show context-sensitive help (also try --help-long and --help-man).
  -v, --debug                 Enable debug mode.
      --trace                 Enable tracing of code line numbers.
      --json                  Enable JSON output.
      --send-error-telemetry  Turns error telemetry off.
      --quiet                 Only show results.
Args:
  [<path>]  Path of the file to scan
Example
You will need to obtain credentials to run this. You can get them by creating a scanner group (on your isolated instance go to settings -> scanners) and downloading the config.
Tip: if you invoke the scanner frequently, run with --no-update to cut down on startup time by skipping updates.
# Three different ways to invoke the stdin and file scanner
./trufflehog file --config config.yaml --json /etc/passwd
cat /etc/passwd | ./trufflehog file --config config.yaml --json
./trufflehog file --config config.yaml --json < /etc/passwd
When using Docker, you must include the --interactive or -i flag (but not -t or --tty) for Docker to pass stdin to TruffleHog:
docker run --net=host --restart=unless-stopped -v $(pwd)/config.yaml:/tmp/config.yaml -i us-docker.pkg.dev/thog-artifacts/public/scanner:latest file --config=/tmp/config.yaml
GCS (Google Cloud Storage)
ProjectID is required. If you omit buckets, all buckets that the credential can list and access will be scanned.
When using the include/exclude filters for both buckets and objects, the include filters take precedence if both are specified. It is recommended to only use one of the two filters for each.
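For instance, a source restricted with include filters only might be sketched as follows. The field names are the ones used in the configurations in this section; the bucket and object names are placeholders.

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    adc: {}
    includeBuckets:        # include filters only; no excludeBuckets or excludeObjects
    - bucket1
    includeObjects:
    - object1
    projectId: my-project  # required
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true
```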
GCS with GCP IAM credentials (recommended)
Example IAM policy:
{
  "version": "1",
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:<user_email>"
      ]
    },
    {
      "role": "roles/viewer",
      "members": [
        "user:<user_email>"
      ]
    }
  ]
}
Configuration:
sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    adc: {}
    excludeBuckets:
    - bucket3
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    projectId: my-project  # required
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true
GCS with service account file (JSON)
Example IAM policy:
{
  "version": "1",
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:<user_email>"
      ]
    },
    {
      "role": "roles/viewer",
      "members": [
        "user:<user_email>"
      ]
    }
  ]
}
Configuration:
sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    excludeBuckets:
    - bucket3
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    projectId: my-project  # required
    serviceAccountFile: /path/to/service-account.json
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true
GCS without authentication
Can only be used for public buckets.
Configuration:
sources:
- connection:
    '@type': type.googleapis.com/sources.GCS
    excludeObjects:
    - object3
    includeBuckets:
    - bucket1
    - bucket2
    includeObjects:
    - object1
    - object2
    unauthenticated: {}
  name: GCS
  scanInterval: 43200s
  type: SOURCE_TYPE_GCS
  verify: true
Gerrit
If you omit projects, all code projects that the credential can list and access will be scanned.
sources:
- connection:
    '@type': type.googleapis.com/sources.Gerrit
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://gerrit.example.com
  name: Gerrit
  scanInterval: 43200s
  type: SOURCE_TYPE_GERRIT
  verify: true
Git
The Git source expects a list of repository URIs and/or a list of local directories containing repositories to scan.
Unauthenticated
sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    directories:
    - /home/me/dev/vscode
    repositories:
    - https://github.com/dustin-decker/secretsandstuff.git
    unauthenticated: {}
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true
Basic Auth
sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    basicAuth:
      password: clonePassword
      username: cloneUser
    repositories:
    - https://github.com/dustin-decker/secretsandstuff.git
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true
SSH Auth
sources:
- connection:
    '@type': type.googleapis.com/sources.Git
    repositories:
    - ssh://github.com/dustin-decker/secretsandstuff.git
    sshAuth: {}
  name: Git
  scanInterval: 43200s
  type: SOURCE_TYPE_GIT
  verify: true
GitHub
Personal Access Tokens should be created with the following scopes: repo, gist, and read:org.
sources:
- connection:
    '@type': type.googleapis.com/sources.GitHub
    endpoint: https://github.ourbusiness.com
    ignoreRepos:
    - trufflesecurity/trufflehog
    - torvalds/linux
    includeForks: true
    organizations:
    - trufflesecurity
    repositories:
    - https://github.ourbusiness.com/torvalds/linux.git
    scanUsers: true
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: GitHub
  scanInterval: 43200s
  type: SOURCE_TYPE_GITHUB
  verify: true
GitLab
Token Auth
The GitLab token should be created with the read_api scope.
sources:
- connection:
    '@type': type.googleapis.com/sources.GitLab
    endpoint: https://gitlab.ourbusiness.com
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: GitLab
  scanInterval: 43200s
  type: SOURCE_TYPE_GITLAB
  verify: true
Basic Auth
sources:
- connection:
    '@type': type.googleapis.com/sources.GitLab
    basicAuth:
      password: t0ken
      username: svc-user
    endpoint: https://gitlab.ourbusiness.com
    ignoreRepos:
    - trufflesecurity/trufflehog
    - torvalds/linux
  name: GitLab
  scanInterval: 43200s
  type: SOURCE_TYPE_GITLAB
  verify: true
Jenkins
sources:
- connection:
    '@type': type.googleapis.com/sources.Jenkins
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: scanner-account
    endpoint: https://jenkins.example.com
  name: Jenkins logs and artifacts
  scanInterval: 43200s
  type: SOURCE_TYPE_JENKINS
  verify: true
JIRA
JIRA Cloud must be configured with basic authentication, using an email address as the username and a JIRA Cloud token as the password.
For on-premise JIRA instances, you can use a username and password with basic authentication, or you can use a personal access token (PAT) with token authentication.
If you omit projects, all projects that the credential can list and access will be scanned.
JIRA with basic authentication
sources:
- connection:
    '@type': type.googleapis.com/sources.JIRA
    basicAuth:
      password: XXXXXXXXXXXXXXXXXXXXXXXXXX
      username: [email protected]
    endpoint: https://ourbusiness.atlassian.net
    projects:
    - ENG
    - ITSYS
  name: JIRA
  scanInterval: 43200s
  type: SOURCE_TYPE_JIRA
  verify: true
JIRA with personal access token (PAT)
sources:
- connection:
    '@type': type.googleapis.com/sources.JIRA
    endpoint: https://ourbusiness.atlassian.net
    projects:
    - ENG
    - ITSYS
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: JIRA
  scanInterval: 43200s
  type: SOURCE_TYPE_JIRA
  verify: true
Microsoft Teams
During the initial setup of the Teams integration, an admin account will need to provide permission as part of the OAuth2 flow.
To enable the necessary functionality in the Teams integration, the following scopes are required:
- ChannelMessage.Read.All: This scope allows TruffleHog to access and read messages in all public and private channels within the Teams workspace, unless specific include/exclude filters are applied during the setup process. Please note that scanning direct messages is currently not supported.
Currently, each integration can only be configured to scan a single team. If you want to scan multiple teams, you will need to create separate integrations. However, we plan to enhance this capability in the future, allowing multiple teams to be scanned from a single integration.
When configuring the Teams scanner from the UI Dashboard, the Team ID refers to your Microsoft Teams ID. To find it, open the Teams app, click the … button next to the team in the left-hand pane, and click “Get link to team”. Make sure the menu is for the team and not for a channel.
The Team ID comes right after groupId in the link provided (e.g. xxxgroupId=&tenantId=xxx).
The Teams integration requires the web UI in order to successfully scan sources. (Local config will be made available in the near future with the use of Client Credentials or OAuth2.)
Microsoft SharePoint
During the initial setup of the SharePoint integration, an admin account will need to provide permission as part of the OAuth2 flow.
To enable the necessary functionality in the SharePoint integration, the following scopes are required:
- AllSites.Read: This scope allows the scanner to access and read all sites within your SharePoint workspace.
- Sites.Search.All: This scope allows the scanner to navigate through all the contents within your SharePoint workspace.
- offline_access: This scope allows TruffleHog Enterprise to maintain the state of the secrets detected by the scanner.
When configuring the SharePoint scanner from the UI, the Site URL refers to your Microsoft SharePoint site. To find it:
- Log into SharePoint
- Click into a SharePoint site
- From the URL in the address bar, copy only the portion up to (and including) .com. Example: https://trufflesecurity.sharepoint.com/
The SharePoint integration requires the web UI in order to successfully scan sources. (Local config will be made available in the near future with the use of Client Credentials or OAuth2.)
Slack
sources:
- connection:
    '@type': type.googleapis.com/sources.Slack
    channels:
    - General
    - Random
    endpoint: https://mybusiness.slack.com
    ignoreList:
    - General
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX
  name: Slack
  scanInterval: 43200s
  type: SOURCE_TYPE_SLACK
  verify: true
Single Workspace App
If possible, we recommend installing Slack from the UI: it is much easier, and it scans faster because it has higher rate limits.
You may create your own single workspace Slack app to use with TruffleHog and provide the refresh token in the token field in the example above. Below are the steps to create the app.
- Start creating the app here
- Give the app a name and choose the workspace you want TruffleHog to operate on. (You will need separate apps for multiple workspaces.)
- Update the “User Token Scopes” section with the following scopes:
  - users:read
  - users:read.email
  - channels:history
  - channels:read
  - groups:history
  - groups:read
  - files:read
- Make sure everything is saved and looks correct, then install your app!
- If your user does not have permission to install the app, Slack may send a request to your Slack admin asking them to approve it. If so, it may be a good idea to give them a heads up before you do this :)
- Copy your newly minted token and paste it into the token field of the local configuration file above. (TIP: Remove the channels line and its values if you want TruffleHog to scan all accessible channels.)
- Once you run your local scan, TruffleHog will pick up and scan the configured Slack source!
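Putting the tip above into practice, a Slack source meant to scan all accessible channels leaves out channels. This is a sketch based on the example at the top of this section, with ignoreList also dropped for brevity:

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.Slack
    endpoint: https://mybusiness.slack.com
    token: XXXXXXXXXXXXXXXXXXXXXXXXXX   # token from your single workspace app
  name: Slack
  scanInterval: 43200s
  type: SOURCE_TYPE_SLACK
  verify: true
```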
S3
If you omit buckets, all buckets that the credential can list and access will be scanned.
S3 with AWS IAM credentials (recommended)
Example IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListAllMyBuckets",
        "s3:ListBucket",
        "s3:GetObject"
      ],
      "Resource": "*"
    }
  ]
}
Configuration:
sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    cloudEnvironment: {}
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true
S3 with static credentials
sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    accessKey:
      key: AKIAKEYID
      secret: XXXXXXXXXXXXXXXXXXXXXXXXXX
    buckets:
    - bucket-one
    - bucket-two
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true
S3 with AWS IAM Role Assumption
sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    roles:
    - roleArn-1
    - roleArn-2
    sessionToken: {}
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true
IAM roles can be assumed by an IAM entity, such as a user or role, that is an allowed principal in an IAM trust policy attached to the role.
Example trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/Bob"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Otherwise, roles behave similarly to IAM credential (user) access. Passing a specific role ARN and bucket will result in only that bucket being scanned.
Passing in a role ARN without specifying a bucket will result in all buckets that the role can list being scanned. Multiple roles can be specified as individual arguments.
If a bucket or buckets are supplied in addition to multiple roles, a scan will be attempted against each bucket by each role.
The recommended approach is to define an IAM user or service account with minimal or no local resource permissions, and define it as an allowed principal in a trust policy using the example above. This policy will then need to be attached to roles in every environment or AWS account in which credential scans are needed.
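As a sketch of the combination described above, the configuration below supplies two roles and two buckets, so a scan would be attempted against each bucket by each role. The role ARNs are hypothetical, and placing buckets alongside roles is an assumption carried over from the static credentials example.

```yaml
sources:
- connection:
    '@type': type.googleapis.com/sources.S3
    buckets:               # assumed to combine with roles as in the static credentials example
    - bucket-one
    - bucket-two
    roles:
    - arn:aws:iam::123456789012:role/trufflehog-scan   # hypothetical role ARN
    - arn:aws:iam::210987654321:role/trufflehog-scan   # hypothetical role ARN
    sessionToken: {}
  name: S3
  scanInterval: 43200s
  type: SOURCE_TYPE_S3
  verify: true
```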