gobeyondidentity/ById_Google_SCIM

Google Workspace to Beyond Identity SCIM Sync Tool

Overview

This is a production-ready Python-based SCIM sync tool that synchronizes users and groups from Google Workspace to Beyond Identity. The tool uses Google's Admin SDK API and Beyond Identity's SCIM API to automate user lifecycle management.

What's Included

  • gwbisync.py - Main sync script (~1,747 lines, 20 functions)
  • config.py - Configuration file with placeholder values
  • setup.sh - Automated environment setup script
  • tests/ - 47 unit tests across 4 test files
  • README_CLIENT.md - Client-facing setup guide with step-by-step screenshots

For a full project structure breakdown, see README_CLIENT.md § Project Structure.

Quick Start

Option 1: Using the Setup Script (Recommended)

# Run the automated setup
./setup.sh

# Activate the virtual environment
source venv/bin/activate

# Edit config.py with your credentials
nano config.py

# Run in dry-run mode first (preview changes without applying them)
python3 gwbisync.py --dry-run

# When ready, run in production mode
python3 gwbisync.py

Option 2: Manual Setup

# Create virtual environment
python3 -m venv venv

# Activate it
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create logs and snapshots directories
mkdir -p logs snapshots

# Configure credentials
nano config.py

# Run in dry-run mode first
python3 gwbisync.py --dry-run

# Run the sync
python3 gwbisync.py

Configuration

Required Configuration Variables

All configuration is done in config.py. You must replace these placeholder values:

Variable | Description | Example Placeholder
--- | --- | ---
GWS_SUPER_ADMIN_EMAIL | Google Workspace admin email | "your.admin@yourdomain.com"
GWS_DOMAIN_NAME | Your Google Workspace domain | "yourdomain.com"
SERVICE_ACCOUNT_KEY_PATH | Path to service account JSON | "your-service-account-key.json"
GWS_GROUPS | List of group emails to sync | ["group1@yourdomain.com"]
BI_TENANT_API_TOKEN | Beyond Identity API token | "your_beyond_identity_api_token_here"

Optional Configuration Variables

Variable | Default | Description
--- | --- | ---
BI_GROUP_PREFIX | "GoogleSCIM_" | Prefix added to group names in Beyond Identity
LOGGING_ENABLED | True | Enable file logging to logs/ directory
LOGGING_LEVEL | "INFO" | Log level: DEBUG, INFO, WARNING, ERROR
LOGGING_DIR | "logs" | Directory for log files
LOGGING_FILE_FORMAT | "sync_%Y%m%d_%H%M%S.log" | Log filename pattern
TEST_MODE | False | Legacy config option; prefer the --dry-run CLI flag instead
RATE_LIMIT_DELAY | 2.0 | Delay between API calls (seconds)

⚠️ Security Note: The config.py file is included in git with placeholder values as a template. After cloning, edit it with your real credentials but DO NOT commit your changes back to git. Restrict file permissions after editing: chmod 600 config.py
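For orientation, a filled-in config.py might look like the sketch below. The variable names and defaults are taken from the two tables above; every value is a placeholder, and the file shipped in the repository may be laid out differently.

```python
# config.py (illustrative sketch; all values are placeholders)

# Required: replace each placeholder before the first run
GWS_SUPER_ADMIN_EMAIL = "your.admin@yourdomain.com"
GWS_DOMAIN_NAME = "yourdomain.com"
SERVICE_ACCOUNT_KEY_PATH = "your-service-account-key.json"
GWS_GROUPS = ["group1@yourdomain.com"]
BI_TENANT_API_TOKEN = "your_beyond_identity_api_token_here"

# Optional: shown with the documented defaults
BI_GROUP_PREFIX = "GoogleSCIM_"
LOGGING_ENABLED = True
LOGGING_LEVEL = "INFO"
LOGGING_DIR = "logs"
LOGGING_FILE_FORMAT = "sync_%Y%m%d_%H%M%S.log"
TEST_MODE = False  # legacy; prefer the --dry-run CLI flag
RATE_LIMIT_DELAY = 2.0  # seconds between API calls
```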

Google Cloud & Beyond Identity Setup

For step-by-step setup instructions with screenshots, see README_CLIENT.md (sections 3.2–3.3).

Quick reference — required OAuth scopes:

https://www.googleapis.com/auth/admin.directory.user.readonly
https://www.googleapis.com/auth/admin.directory.group.readonly
https://www.googleapis.com/auth/admin.directory.group.member.readonly
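As a rough illustration of how these scopes are typically used, the sketch below builds a Directory API client with domain-wide delegation. It assumes the google-auth and google-api-python-client packages are installed; build_directory_service is a hypothetical name, and the tool's own get_google_service() may be implemented differently.

```python
SCOPES = [
    "https://www.googleapis.com/auth/admin.directory.user.readonly",
    "https://www.googleapis.com/auth/admin.directory.group.readonly",
    "https://www.googleapis.com/auth/admin.directory.group.member.readonly",
]


def build_directory_service(key_path, admin_email):
    """Return a Directory API client that impersonates admin_email (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without the Google libraries
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        key_path,
        scopes=SCOPES,
        subject=admin_email,  # domain-wide delegation: act as the super admin
    )
    return build("admin", "directory_v1", credentials=credentials)
```

All three scopes are read-only, which matches the tool's guarantee that it never writes back to Google Workspace.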

How the Sync Works

Phase 1: Pre-flight Checks

  • Validates all configuration values (including service account JSON structure and permissions)
  • Checks BI API token expiration (warns if expiring within 7 days)
  • Tests Beyond Identity API connectivity
  • Initializes Google Workspace service (with specific error handling)
  • Fetches existing Beyond Identity users (with pagination support)

Phase 2: Group Processing

For each Google Workspace group configured:

  1. Fetch members from Google Workspace group
  2. Create/get group in Beyond Identity (with BI_GROUP_PREFIX)
  3. Process each user:
    • Create or update user in Beyond Identity via SCIM
    • Add user to the corresponding Beyond Identity group
  4. Remove users no longer in the Google Workspace group

If any group cannot be read (auth failure, permission denied, not found, timeout), the entire sync is canceled with troubleshooting tips. This prevents users from being incorrectly suspended just because their group couldn't be accessed.
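The abort rule can be pictured as a fail-closed read step: collect every group's membership first, and refuse to continue if even one read fails. The sketch below is one way to get that behavior; GroupReadError, read_all_groups, and the fetch_members callback are hypothetical names, not the tool's actual functions.

```python
class GroupReadError(Exception):
    """Raised when at least one configured group cannot be read."""


def read_all_groups(group_emails, fetch_members):
    """Return {group_email: set(member_emails)}, or raise before any change is made.

    fetch_members is a caller-supplied function (hypothetical here) that returns
    an iterable of member emails, or raises on auth, permission, or timeout errors.
    """
    memberships = {}
    failures = []
    for email in group_emails:
        try:
            memberships[email] = set(fetch_members(email))
        except Exception as exc:  # broad on purpose: any read failure is fatal
            failures.append(f"{email}: {exc}")
    if failures:
        # Fail closed: a partial view of memberships must never drive suspensions
        raise GroupReadError("Groups could not be read: " + "; ".join(failures))
    return memberships
```

Treating an unreadable group as an error rather than as an empty group is what keeps its members out of the suspension list.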

Phase 3: Cleanup (Safety Confirmed)

  • Only runs if all groups were successfully read
  • Identifies users not in any configured groups
  • Requires user confirmation before suspending users (or --yes for automated runs)
  • Suspends users and logs all actions to audit_suspensions.csv
  • Removes suspended users from synced groups
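A minimal sketch of the cleanup decision follows, under the assumption that memberships maps each configured group to a set of member emails and that scim_users lists the emails of SCIM-sourced Beyond Identity users. Both function names and the prompt wording are illustrative; the 25-user preview limit is the one mentioned elsewhere in this README.

```python
import sys


def users_to_suspend(scim_users, memberships):
    """Return sorted emails of SCIM-sourced users that belong to no configured group."""
    still_members = set().union(*memberships.values())
    return sorted(set(scim_users) - still_members)


def confirm_suspensions(candidates, assume_yes=False, stdin=None):
    """Return True only when suspension has been explicitly approved."""
    stdin = sys.stdin if stdin is None else stdin
    if not candidates:
        return False  # nothing to suspend
    if assume_yes:
        return True  # corresponds to running with --yes
    if not stdin.isatty():
        return False  # fail closed: nobody is present to answer the prompt
    for email in candidates[:25]:
        print(f"  would suspend: {email}")
    return input("Suspend these users? [y/N] ").strip().lower() == "y"
```

Defaulting to "no" in a non-interactive session is what makes a cron job without --yes exit without suspending anyone.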

Safety Features

  • Read-only for Google Workspace - Never writes back to GWS
  • SCIM-only user management - Manually created BI users are not touched
  • File locking - Prevents concurrent script runs (.gwbisync.lock)
  • Audit logging - All suspensions logged to audit_suspensions.csv
  • Confirmation prompt - Requires approval before suspending users (fails closed in non-interactive environments without --yes)
  • Input sanitization - SCIM filter injection prevention via sanitize_scim_value()
  • Rate limiting - Configurable delay between API calls (default: one request every 2 seconds, set by RATE_LIMIT_DELAY)
  • Retry logic - Exponential backoff for failed API calls (3 retries)
  • Error validation - Comprehensive pre-flight configuration checks
  • Service account validation - Validates JSON structure, required fields, file permissions
  • Token expiry detection - Warns if BI API token is expiring soon or already expired
  • Rollback support - Every sync saves a snapshot for reversal via --rollback
  • Dry-run mode - --dry-run flag previews all changes without applying them
  • Group-read abort - Cancels sync if any Google group can't be read (prevents incorrect suspensions)

API Endpoints Used

Beyond Identity SCIM API

  • Users: https://api.byndid.com/scim/v2/Users
  • Groups: https://api.byndid.com/scim/v2/Groups
  • Native API: https://api.byndid.com/v2/users (for source verification)
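As an example of how the Users endpoint might be queried safely, the sketch below escapes a value before placing it in a SCIM filter string. The escaping follows the quoted-string rules SCIM filters inherit from JSON; escape_scim_string and user_lookup_url are hypothetical helpers, the real sanitize_scim_value() may work differently, and support for a userName eq filter on this endpoint is assumed here.

```python
from urllib.parse import urlencode

SCIM_USERS_URL = "https://api.byndid.com/scim/v2/Users"


def escape_scim_string(value):
    """Escape a value for use inside a double-quoted SCIM filter string."""
    # Backslash first, so the backslashes added for quotes are not doubled
    return value.replace("\\", "\\\\").replace('"', '\\"')


def user_lookup_url(user_name):
    """Build a Users query URL that filters on an exact userName match."""
    scim_filter = f'userName eq "{escape_scim_string(user_name)}"'
    query = urlencode({"filter": scim_filter})
    return SCIM_USERS_URL + "?" + query
```

Without the escaping step, a user name containing a double quote could close the string early and change the meaning of the filter.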

Google Workspace Admin SDK

  • Directory API v1 for users, groups, and members
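Group membership in the Directory API is returned one page at a time. The sketch below shows the usual nextPageToken loop, assuming service is a client built with google-api-python-client; iter_group_members is a hypothetical name and not necessarily how get_group_members() is written.

```python
def iter_group_members(service, group_email):
    """Yield every member of a group, following nextPageToken until exhausted."""
    page_token = None
    while True:
        response = (
            service.members()
            .list(groupKey=group_email, pageToken=page_token, maxResults=200)
            .execute()
        )
        # A group with no members returns a response without a "members" key
        yield from response.get("members", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```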

Core Functions (20 total)

Function | Purpose
--- | ---
main() | Entry point; orchestrates the full sync workflow
validate_configuration() | Pre-flight validation of all config values, SA JSON structure, and file permissions
get_api_headers() | Returns API headers with current BI token
sanitize_scim_value() | SCIM filter injection prevention
get_google_service() | Initialize Google Workspace API client (with specific error handling)
get_group_members() | Fetch all members from a GWS group (with pagination)
get_user_details() | Get detailed user info from GWS (with specific error handling)
create_bi_group() | Create or retrieve Beyond Identity group
create_or_update_bi_user() | Sync user to Beyond Identity via SCIM
add_user_to_group() | Add user to BI group (checks if already member)
remove_user_from_group() | Remove user from BI group
suspend_user() | Suspend user in Beyond Identity
is_scim_sourced_user() | Verify if user was created via SCIM (with caching)
get_bi_user_info() | Get user details from BI native API
log_suspension_audit() | Write suspension events to CSV audit log
rate_limit() | Apply delay between API calls
retry_api_call() | Retry failed API calls with exponential backoff
save_snapshot() | Save JSON snapshot of all sync actions for rollback
list_snapshots() | List available sync snapshots with action counts
rollback_snapshot() | Reverse actions from a saved sync snapshot

Logging

Console Output

  • Real-time progress during sync
  • Color-coded status messages (✓, ✗, ⚠)
  • Detailed error messages with troubleshooting hints

File Logging

When LOGGING_ENABLED = True:

  • Logs written to logs/sync_YYYYMMDD_HHMMSS.log
  • Includes timestamps, log levels, and detailed operation info
  • View with: tail -f logs/sync_*.log

Audit Trail

  • audit_suspensions.csv - Records all user suspension attempts
  • Includes: timestamp, user email, user ID, reason, success status
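The audit trail can be thought of as an append-only CSV with one row per suspension attempt. The sketch below uses the five fields listed above; the exact column names, the timestamp format, and the append_suspension_record helper are assumptions, and log_suspension_audit() in the script may differ.

```python
import csv
import os
from datetime import datetime, timezone

# Column names are illustrative; the README lists the fields but not their headers
AUDIT_FIELDS = ["timestamp", "user_email", "user_id", "reason", "success"]


def append_suspension_record(path, user_email, user_id, reason, success):
    """Append one suspension attempt to the CSV audit log, creating it if needed."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=AUDIT_FIELDS)
        if is_new:
            writer.writeheader()  # header only once, on the first write
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user_email": user_email,
            "user_id": user_id,
            "reason": reason,
            "success": success,
        })
```

Recording failed attempts as well as successful ones matters here, since the README describes the log as covering all suspension attempts.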

Command-Line Interface

The script supports these CLI flags:

python3 gwbisync.py                   # Run sync — will show who will be suspended and ask for confirmation
python3 gwbisync.py --yes             # Run sync — skips confirmation, suspends automatically (for cron)
python3 gwbisync.py --dry-run         # Preview changes without applying them
python3 gwbisync.py --dry-run -v      # Preview with verbose/debug output
python3 gwbisync.py --list-snapshots  # List available rollback snapshots
python3 gwbisync.py --rollback snapshots/sync_20260302_120000.json  # Rollback a sync
python3 gwbisync.py --help            # Show all options

Tip: On your first run, use python3 gwbisync.py without --yes. The script will list up to 25 users that would be suspended and ask you to confirm before making any changes. Once you're confident the sync is working correctly, add --yes for automated/cron use.

Flag | Description
--- | ---
--dry-run | Preview all changes without applying them
-v, --verbose | Enable debug-level logging output
--yes, -y | Skip the suspension confirmation prompt. Required for cron/automation: without it, non-interactive environments will refuse to suspend users. For your first run, omit this flag so you can review who will be suspended before approving.
--list-snapshots | List available sync snapshots for rollback
--rollback SNAPSHOT | Roll back a previous sync using a snapshot file
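For readers who want to see how such a flag set maps onto argparse, here is a minimal sketch. The flag names come from the table above; build_parser, the help strings, and the parser layout are illustrative and not copied from gwbisync.py.

```python
import argparse


def build_parser():
    """Build a parser exposing the documented flags (illustrative layout)."""
    parser = argparse.ArgumentParser(
        prog="gwbisync.py",
        description="Sync Google Workspace users and groups to Beyond Identity",
    )
    parser.add_argument("--dry-run", action="store_true",
                        help="preview all changes without applying them")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable debug-level logging output")
    parser.add_argument("-y", "--yes", action="store_true",
                        help="skip the suspension confirmation prompt")
    parser.add_argument("--list-snapshots", action="store_true",
                        help="list available sync snapshots for rollback")
    parser.add_argument("--rollback", metavar="SNAPSHOT",
                        help="roll back a previous sync using a snapshot file")
    return parser
```

Calling build_parser().parse_args(["--dry-run", "-v"]) yields a namespace with dry_run and verbose set to True and the other flags left at their defaults.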

Rollback Support

Every production sync automatically saves a JSON snapshot to snapshots/ recording all actions taken (users created, suspended, added/removed from groups). If something goes wrong, you can reverse it:

# See what snapshots are available
python3 gwbisync.py --list-snapshots

# Rollback a specific sync
python3 gwbisync.py --rollback snapshots/sync_20260302_120000.json

Rollback will:

  • Unsuspend users that were suspended
  • Re-add users to groups they were removed from
  • Remove users from groups they were added to
  • Leave users that the sync created in place (automatic deletion is considered too dangerous; a warning is logged instead)
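The rollback rules above amount to mapping each recorded action to its inverse and replaying them newest first. The sketch below assumes a snapshot shaped like {"actions": [{"type": ..., "user_id": ...}, ...]}; that layout, the action type names, and plan_rollback are hypothetical, since the README does not document the snapshot format.

```python
# Hypothetical action names; the real snapshot schema is not documented here
INVERSE = {
    "suspend_user": "unsuspend_user",
    "remove_from_group": "add_to_group",
    "add_to_group": "remove_from_group",
}


def plan_rollback(snapshot):
    """Return (inverse_actions, warnings) for a snapshot, newest action first."""
    plan, warnings = [], []
    for action in reversed(snapshot.get("actions", [])):
        kind = action.get("type")
        if kind == "create_user":
            # Undoing a creation would mean deleting a user, so only warn
            warnings.append(f"skipped deletion of {action.get('user_id')}")
        elif kind in INVERSE:
            plan.append({**action, "type": INVERSE[kind]})
        else:
            warnings.append(f"unknown action type: {kind}")
    return plan, warnings
```

Building the plan separately from executing it also makes it easy to print what a rollback would do before anything is changed.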

Dry-Run Mode

Always test first! Use --dry-run to preview changes before running in production:

  • See what the sync would do without making any changes
  • Preview user creations, updates, and suspensions
  • Verify configuration and API connectivity
  • Safe to run repeatedly - nothing is written to Beyond Identity, and Google Workspace is only ever read

Troubleshooting

Common Issues

"Service account authentication failed"

  • Verify service account JSON file path is correct
  • Check domain-wide delegation is enabled
  • Ensure OAuth scopes are exactly as specified
  • Wait 5-15 minutes after enabling delegation

"Group not found"

  • Verify group email addresses are correct
  • Check service account has permission to read the group
  • Ensure groups exist in Google Workspace

"Beyond Identity API authentication failed"

  • Verify API token is valid and has SCIM permissions
  • Check token hasn't expired
  • Ensure SCIM is enabled for your BI organization

"CANCELING SYNC - Groups could not be read"

  • Verify group email addresses in config.py are correct
  • Check that GWS_SUPER_ADMIN_EMAIL is a super admin in the same domain as the groups
  • Confirm domain-wide delegation is enabled for the service account
  • Ensure the admin.directory.group.readonly API scope is authorized
  • Check that Admin SDK API is enabled in Google Cloud Console
  • If the service account key was recently rotated, update SERVICE_ACCOUNT_KEY_PATH

"Rate limit exceeded"

  • Increase RATE_LIMIT_DELAY in config.py
  • Script will automatically retry with exponential backoff
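The retry behavior can be sketched as follows, assuming three retries and a delay that doubles after each failure. The retry_with_backoff name, its parameters, and the 2-second base delay are illustrative; the script's own retry_api_call() may use different values or only retry specific errors.

```python
import time


def retry_with_backoff(call, retries=3, base_delay=2.0, sleep=time.sleep):
    """Run call(); on failure wait base_delay, then 2x, then 4x before retrying."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))
```

The sleep parameter exists so tests can pass a stand-in and avoid real waiting.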

Unit Tests

The project includes a comprehensive test suite with 47 tests across 4 test files:

# Install test dependencies
pip install -r requirements-test.txt

# Run all tests
python3 -m pytest tests/ -v

Test File | Tests | What's Covered
--- | --- | ---
test_validate_configuration.py | 14 | Config validation, SA JSON validation, placeholder detection, file permissions
test_api_calls.py | 16 | BI group creation, user sync, suspension, SCIM source checking
test_rollback.py | 9 | Snapshot save/load, rollback logic, error handling
test_rate_limit.py | 8 | Rate limiting, retry with backoff, error recovery

All tests use mocked HTTP calls - no real API access needed.

Security Best Practices

  • ✅ Never commit config.py with real credentials
  • ✅ Never commit actual service account JSON files
  • ✅ Rotate API tokens regularly (every 90 days recommended)
  • ✅ Restrict service account JSON file permissions (chmod 600)
  • ✅ Test thoroughly with --dry-run before production use
  • ✅ Review audit logs after each sync

Production Deployment

Recommended Workflow

  1. Clone repository
  2. Run ./setup.sh to configure environment
  3. Edit config.py with real credentials
  4. Download service account JSON from Google Cloud
  5. Restrict file permissions: chmod 600 your-service-account-key.json
  6. Preview changes (no modifications made): python3 gwbisync.py --dry-run
  7. Review output and logs carefully
  8. First real run (without --yes): python3 gwbisync.py
    • The script will show you exactly which users will be suspended and ask for confirmation before proceeding — use this to verify everything looks correct
  9. Review audit_suspensions.csv for suspended users
  10. If anything went wrong: python3 gwbisync.py --rollback snapshots/sync_XXXX.json
  11. Once verified, set up cron with --yes for ongoing automated runs (see below)

Scheduling with Cron

# Run sync daily at 2 AM
# Calls the venv interpreter directly, because cron's default /bin/sh may not support `source`
0 2 * * * cd /path/to/project && venv/bin/python3 gwbisync.py --yes >> logs/cron.log 2>&1

Important: The --yes flag is required for cron and other non-interactive environments. Without it, the script will refuse to suspend users and exit safely.

License

This tool is provided as-is for internal use. Ensure compliance with your organization's security policies when handling service account credentials and API tokens.


Status: ✅ Production Ready | Lines of Code: ~1,747 | Functions: 20 | Tests: 47 | Last Updated: March 2026
