Export to Snowflake Using Python

Overview

The Procore Analytics Cloud Connect Access tool is a command-line interface (CLI) that helps you configure and manage data transfers from Procore to Snowflake.

It consists of two main components:

  • user_exp.py: Configuration setup utility

  • ds_to_snowflake.py: Data synchronization script

Prerequisites

  • Python is installed on your system

  • Access to Procore Delta Share

  • Snowflake account credentials

  • Download the zipped package from the company-level Procore Analytics tool (via Procore Analytics > Getting Started > Connection Options > Snowflake).

  • Install the required dependencies using:

    • pip install -r requirements.txt

Steps

  • Initial Configuration

  • Data Synchronization

  • Delta Share Configuration

  • Snowflake Configuration

  • Scheduling Options

  • Best Practices

  • Troubleshooting

Initial Configuration

Run the configuration utility using `python user_exp.py`.

Data Synchronization

After configuration, you have two options to run the data sync:

  • Direct Execution:

    • python ds_to_snowflake.py

  • Scheduled Execution

    • If configured during setup, the job will run automatically according to your Cron schedule.

    • The file `procore_scheduling.log` is created as soon as scheduling is set up; check it for scheduling logs.

    • You can also verify the schedule by running the following command in a terminal:

For Linux and macOS:

  • To edit or delete the scheduled job, open the crontab for editing:

    ```bash
    EDITOR=nano crontab -e
    ```

  • After running the command above, you should see something similar to:

    2 * * * * /Users/your_user/snowflake/venv/bin/python /Users/your_user/snowflake/sql_server_python/connection_config.py 2>&1 | while read line; do echo "$(date) - $line"; done >> /Users/your_user/snowflake/sql_server_python/procore_scheduling.log # procore-data-import

  • You can also adjust the cron schedule, or delete the entire line to stop the job from running on a schedule.

For Windows:

  • Check that the scheduled task has been created:

    ```powershell
    schtasks /query /tn "ProcoreDeltaShareScheduling" /fo LIST /v
    ```

  • To edit or delete the scheduled task, open the Task Scheduler.

  • Press Win + R, type taskschd.msc, and press Enter.

  • Navigate to the scheduled tasks.

  • In the left pane, expand the Task Scheduler Library.

  • Look for the folder where your task is saved:
    Example: Task Scheduler Library or a custom folder.

  • Find your task.

  • Look for the task name: ProcoreDeltaShareScheduling.

  • Click on it to view the details in the bottom pane.

  • Verify its schedule:

    • Check the Triggers tab to see when the task is set to run.

    • Check the History tab to confirm recent runs.

  • To delete the task:

    • Delete the task from the Task Scheduler GUI.

Delta Share Configuration

  • Creating config.share file

  • Before running the configuration utility, you need to create a config.share file with your Delta Share credentials. The file should be in JSON format:

    {
      "shareCredentialsVersion": 1,
      "bearerToken": "xxxxxxxxxxxxx",
      "endpoint": "https://nvirginia.cloud.databricks.c...astores/xxxxxx"
    }

  • Required fields:

    • shareCredentialsVersion: Version number (currently 1).

    • bearerToken: Your Delta Share access token.

    • endpoint: Your Delta Share endpoint URL.

    • These details can be obtained from the Procore web UI.

  • Steps to create config.share:

    • Create a new file named config.share.

    • Copy the above JSON template.

    • Replace the placeholder values with your actual credentials.

    • Save the file in a secure location.

    • You'll need to provide the path to this file during configuration.

  • When configuring the data source, you'll be asked to provide:

    • List of tables (comma-separated).

      • Leave blank to sync all tables.

      • Example: table1, table2, table3.

    • Path to your config.share file.
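Before running the configuration utility, it can be worth sanity-checking the file you just created. The sketch below is illustrative and not part of the Procore tooling; it only confirms that the three key names from the JSON template above are present:

```python
import json

# Keys required by the config.share template above (case-sensitive).
REQUIRED_KEYS = {"shareCredentialsVersion", "bearerToken", "endpoint"}

def validate_config_share(path):
    """Load a config.share file and confirm the required keys are present."""
    with open(path) as f:
        creds = json.load(f)
    missing = REQUIRED_KEYS - creds.keys()
    if missing:
        raise ValueError(f"config.share is missing: {sorted(missing)}")
    return creds
```

A `ValueError` here usually means a key was mistyped or capitalized differently than in the template.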

Snowflake Configuration

You'll need to provide the following Snowflake details:

  • Authentication (choose one):

    • User Authentication

      • Username

      • Password (entered securely)

    • Key Pair Authentication

      • Username

      • Private key file path

      • Private key file password

  • Connection Details:

    • Account identifier

    • Warehouse name

    • Database name

    • Schema name

    • Number of concurrent threads
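To see how these answers fit together, here is a small illustrative sketch that bundles the prompted values and applies basic checks. The function and key names are hypothetical; they do not reflect the actual config.yaml schema:

```python
def collect_snowflake_settings(account, warehouse, database, schema, threads):
    """Bundle the Snowflake connection details the utility prompts for,
    with basic sanity checks (illustrative only)."""
    for name, value in [("account", account), ("warehouse", warehouse),
                        ("database", database), ("schema", schema)]:
        if not value or not str(value).strip():
            raise ValueError(f"{name} must not be empty")
    if int(threads) < 1:
        raise ValueError("thread count must be at least 1")
    return {"account": account, "warehouse": warehouse,
            "database": database, "schema": schema, "threads": int(threads)}
```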

Scheduling Options

The tool offers the ability to schedule automatic data synchronization.

  • Cron Job Configuration

    • Choose whether to set up a daily job

    • If yes, provide a cron schedule

    • Format: * * * * * (minute hour day-of-month month day-of-week)

    • Example for daily at 2 AM: 0 2 * * *

  • Immediate Execution

    • Option to run ds_to_snowflake.py immediately after configuration.

File Structure

The package contains the following files:

    ├── requirements.txt        # Dependencies
    ├── user_exp.py             # Configuration utility
    ├── ds_to_snowflake.py      # Data sync script
    ├── config.yaml             # Generated configuration
    ├── config.share            # Delta Share config file
    └── procore_scheduling.log  # Log of scheduling runs


Example Usage

  • Step 1: Install dependencies

    $ pip install -r requirements.txt

  • Step 2: Run the configuration utility

    $ python user_exp.py

    The utility then walks you through prompts similar to:

      Procore Analytics Cloud Connect Access
      This CLI will help you choose your source and destination store to access/write Procore data into Snowflake.
      Press Enter to continue.
      Enter list of tables (comma-separated), leave it blank for all tables: projects,users,tasks
      Enter path to config.share: /path/to/config.share
      Enter user name: snowflake_user
      What authentication type do you want to use? (1 for user, 2 for key_pair): 1
      Enter password: ********
      Enter Account: my_account
      Enter warehouse: my_warehouse
      Enter database name: procore_db
      Enter schema name: procore_schema
      Enter number of threads: 4
      Do you want to configure this as a daily job on cron? (Yes/No): Yes
      Enter the schedule in cron format (e.g., * * * * *): 0 2 * * *
      Do you want to execute the job now? (Yes/No): Yes

  • Step 3: Manual execution (if needed)

    $ python ds_to_snowflake.py

Configuration Reuse

The tool saves your configuration in the config.yaml file and offers to reuse previously stored settings:

  • Source configuration can be reused.

  • Target (Snowflake) configuration can be reused.

  • You can choose to update either configuration independently.
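A mistyped cron expression is a common slip when entering the schedule under Scheduling Options. This small sketch (not part of the tool) checks only that an expression has the five fields described there, not that each field's value is valid:

```python
def looks_like_cron(expr):
    """Rough check that expr has five cron fields:
    minute hour day-of-month month day-of-week."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    # Each field may contain digits plus the common cron symbols * / , -
    allowed = set("0123456789*/,-")
    return all(set(f) <= allowed for f in fields)
```

For example, `looks_like_cron("0 2 * * *")` accepts the daily-at-2-AM schedule shown above, while a four-field expression is rejected.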

Best Practices

  • Authentication

    • Use key pair authentication when possible.

    • Regularly rotate credentials.

    • Use minimal required permissions.

  • Performance

    • Adjust thread count based on your system capabilities.

    • Start with a smaller subset of tables for testing.

Troubleshooting

  • Common issues and solutions:

    • Invalid Authentication Type

      • Be sure to select either '1' (user) or '2' (key_pair) when prompted.

  • Cron Job Setup

    • Verify you have appropriate system permissions.

    • Check system logs if the job fails to run.

    • Ensure ds_to_snowflake.py has the correct permissions.

    • Verify the cron job setup by checking the `procore_scheduling.log` file.

  • Configuration File

    • Located in the same directory as the script.

    • Named config.yaml.

    • Back up the file before making any changes.

  • Support

    • Check the script's logging output.

    • Review your config.yaml file.

    • Contact your system administrator for permission-related issues.

    • Reach out to Procore support for Delta Share access issues.
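When a scheduled run fails, scanning `procore_scheduling.log` for error lines is often the fastest first step. The snippet below is an illustrative sketch; it assumes failures appear as lines containing the word "error", which may not match the tool's actual log format:

```python
def find_log_errors(log_path, keyword="error"):
    """Return lines from the scheduling log that mention the keyword
    (case-insensitive)."""
    hits = []
    with open(log_path) as f:
        for line in f:
            if keyword.lower() in line.lower():
                hits.append(line.rstrip("\n"))
    return hits
```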

Note: Remember to always back up your configuration before making changes, and test new configurations in a non-production environment first.