Git Integration

Data Science Lab supports Git integration at project creation. Projects created this way are called Git Sync Projects.

A Git Sync Project is a data science project that is integrated with Git from the moment it is created. The integration provides version control directly within the Data Science Lab, so that project assets are managed in and synchronized with a Git repository.

Git Sync enables seamless integration between the Data Science Lab (DSLab) and your GitHub or GitLab repositories. With Git Sync, you can:

  • Access and edit repository files directly inside DSLab.

  • Run Git commands via the built-in Git Console.

  • Export scripts from DSLab to the Data Pipeline module for job registration.

This supports collaborative development and version control for your analytics projects.

Prerequisites

  • A GitHub or GitLab account with a repository.

  • An access token and credentials configured under My Account > Configuration.

  • (Optional) Admin-configured repository access if using centralized Git settings.

Configure Git Sync in a DSLab Project

Navigation path: DSLab > Create Project > Git Configuration

  1. Navigate to the DSLab module.

  2. Click Create Project.

  3. Fill in all required fields (Project Name, Environment, Resources, etc.).

  4. Under Git Settings:

    • Select the Git Repository.

    • Select the Git Branch.

    • Enable Sync Git Repo at project creation to access all files in the repository.

  5. Click Save to create the project.

✅ Once created, the repository files will be available under the Repo section of the Notebook tab.

Access Repository Files

  1. Open the created project.

  2. Expand the Repo option in the Notebook tab.

  3. View all files synced from the configured Git repository.
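If the Git Console described later on this page is available, the synced file list can be cross-checked against what Git itself tracks. The following is a minimal sketch, not DSLab's own mechanism: it builds a throwaway repository in a temporary directory (the file names notebooks/train.ipynb and utils/helpers.py are hypothetical) and then prints the checked-out branch and the tracked files, which is what the Repo section would be expected to mirror.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for a synced repository, created in a temporary directory.
demo_repo=$(mktemp -d)
git init --quiet "$demo_repo"
cd "$demo_repo"

# Placeholder identity, stored in this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# Two hypothetical project files: a notebook and a utility script.
mkdir notebooks utils
printf '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}\n' > notebooks/train.ipynb
printf 'def helper():\n    return 42\n' > utils/helpers.py
git add notebooks utils
git commit --quiet -m "Add initial project files"

# Name of the branch that is currently checked out.
git rev-parse --abbrev-ref HEAD

# Every file tracked on that branch.
git ls-files
```

In a real project, only the last two commands would be needed. Whether the Git Console accepts them unchanged is an assumption.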

Git Console

  • The Git Console is available at the bottom of the page in a DSLab project.

  • It allows you to execute Git commands directly from the UI.

  • It supports operations such as cloning, branching, committing, and pushing code.

Sample Git Commands

Git Command                   Associated Action
git init                      Initializes a new Git repository in the current directory.
git clone <repository-url>    Clones a remote repository into a new local directory.
git add <file>                Stages changes to the file for the next commit.
git commit -m "message"       Records staged changes in the repository with a descriptive message.
git status                    Displays the current state of the working directory and staging area.
git log                       Shows commit history with IDs, authors, and messages.
git branch                    Lists all local branches in the repository.
git checkout <branch-name>    Switches to the specified branch.
git merge <branch-name>       Merges changes from the specified branch into the current branch.
git pull                      Fetches changes from a remote repository and merges them into the current branch.
git push                      Pushes local commits to a remote repository.
git remote -v                 Lists remote repositories linked to the local repository.
git fetch                     Retrieves changes from a remote repository without merging.
git diff                      Shows unstaged changes, that is, differences between the working directory and the staging area.
git rm <file>                 Removes a file and stages its removal.
git stash                     Temporarily stores uncommitted changes.
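Several of these commands are typically used together. The following is a minimal sketch of one common sequence (clone, branch, stage, commit, push), offered as an illustration rather than a prescribed DSLab workflow. To keep it runnable without credentials, a throwaway bare repository in a temporary directory stands in for the GitHub or GitLab remote. The branch name feature/analysis, the file name analysis.ipynb, and the user identity are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a local bare repository plays the role of the remote.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"

# Clone the remote into a working directory (Git warns that it is empty).
git clone --quiet "$sandbox/remote.git" "$sandbox/work"
cd "$sandbox/work"

# Placeholder identity, stored in this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# Create a feature branch, add a notebook, stage it, and commit it.
git checkout --quiet -b feature/analysis
printf '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}\n' > analysis.ipynb
git add analysis.ipynb
git commit --quiet -m "Add analysis notebook"

# Publish the branch to the remote.
git push --quiet origin feature/analysis

# Confirm that nothing is left uncommitted and review the history.
git status --short
git log --oneline
```

In a Git Sync Project the repository and branch are already configured at project creation, so only the steps from git checkout onward would normally apply. Whether the Git Console accepts each command exactly as written is an assumption.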


Notes:

  • Only .ipynb files (Jupyter Notebooks) can be exported directly for use in Pipelines and Jobs.

  • .py files can only be used as Utility files. See Utility Scripts.

  • Once a project is synced with Git, you can export its scripts to the Data Pipeline module and register them as jobs.

    • See: Exporting a Script from DSLab
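The file-type rule in these notes can be expressed as a small check. The sketch below is illustrative only: the function name classify_file and the labels "exportable", "utility", and "other" are hypothetical and not part of DSLab. It classifies a file purely by its extension.

```shell
#!/bin/sh
# Classify a file by extension, following the notes above.
# The labels "exportable", "utility", and "other" are illustrative only.
classify_file() {
  case "$1" in
    *.ipynb) echo "exportable" ;;  # notebooks can be exported to Pipelines and Jobs
    *.py)    echo "utility" ;;     # Python scripts serve as Utility files
    *)       echo "other" ;;       # anything else is outside these notes
  esac
}

# Example with hypothetical file names.
for f in train.ipynb helpers.py README.md; do
  printf '%s: %s\n' "$f" "$(classify_file "$f")"
done
```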