This repository has been archived by the owner on May 5, 2021. It is now read-only.
/ oss-dashboard Public archive

A dashboard for viewing many GitHub organizations at once.

License

Apache-2.0, Unknown licenses found

Licenses found

Apache-2.0
LICENSE.txt
Unknown
license-monkey.rb

Amazon Open Source Program GitHub Dashboard

A dashboard for viewing many GitHub organizations, and/or users, at once.

The current version of this code depends on PostgreSQL. If you're looking for the older SQLite dependent version, look at the sqlite_legacy branch.


Screenshot

There are three phases to generating the dashboard.

Phase 1

Sync data from GitHub.

Ruby is used to connect to GitHub, pull down the latest data, and update a PostgreSQL Database.

Phase 2

The latest code is checked out, and review scripts run on the code. Analysis is stored for later use.

Phase 3

An HTML dashboard is generated from the PostgreSQL Database (phase 1) and the analysis of the code (phase 2).

Dependencies

The dashboard assumes the following are installed:

| Dependency | Use |
|---|---|
| PostgreSQL | Database for local copy of GitHub data (including dev package where applicable) |
| git | Pulls source from GitHub |
| Ruby | Executes scripts (tested on version 2.0.0 and 2.2.1) (including dev package where applicable) |
| OctoKit Rubygem - 'octokit' | Access GitHub API |
| Licensee Rubygem - 'licensee' | Identify licensing, though this should go away when the data is provided by OctoKit |
| XML Rubygem - 'libxml-ruby' | Parse XML files |
| XSLT Rubygem - 'libxslt-ruby' | Process XSLT files |
| Libz-dev | Compression |
| Sequel Rubygem - 'sequel' | Execute SQL queries |
| Pg Rubygem - 'pg' | Ruby PostgreSQL driver |

Setup

  • Install the dependencies listed above. Note that the rubygems can be installed by doing 'bundle install' (assuming you have installed bundler: https://github.com/bundler/bundler)
  • Decide how to manage your GitHub personal access token.
    • You can store it in an environment variable named GH_ACCESS_TOKEN; this has the advantage of being harder to accidentally commit.
    • Or you can create a file (outside of the git clone) to contain your GitHub access token. Set the permissions to 600.

To check that your token works, access the following URL and check the response headers: https://api.github.com/user?access_token=YOUR_TOKEN_GOES_HERE

Example file:

 github:
   access_token: 'your github personal access token'
   ssl_verify: false    # Optional - useful for GitHub Enterprise deployments with self-signed certificates

For general use, no specific scopes are required. If you wish to see private organization data (such as Teams, all Members and private Repositories), you will need to enable the 'repo' scope.
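The two ways of supplying the token described above can be combined in a small loader. This is a minimal sketch rather than the dashboard's own code: the helper name `load_github_token` is hypothetical, and it assumes the file layout shown in the example file above.

```ruby
require 'yaml'

# Hypothetical helper (not part of oss-dashboard): prefer the GH_ACCESS_TOKEN
# environment variable, and fall back to github.access_token in a YAML file.
def load_github_token(config_path = nil, env = ENV)
  token = env['GH_ACCESS_TOKEN']
  return token unless token.nil? || token.empty?

  raise ArgumentError, 'GH_ACCESS_TOKEN is not set and no config file was given' if config_path.nil?

  # An empty file parses to nil or false, so fall back to an empty hash.
  config = YAML.load_file(config_path) || {}
  github = config['github'] || {}
  token = github['access_token']
  raise ArgumentError, "no github.access_token in #{config_path}" if token.nil? || token.to_s.empty?

  token
end
```

Calling `load_github_token('/path/outside/clone/config-github.yml')` returns the environment value when it is set, so the file is only read as a fallback.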

  • Create a dashboard configuration file (outside of the git clone).

Example file:

  dashboard:
    organizations: ['amzn', 'amznlabs']       # One, or more,
    logins: ['hyandell']                      #   of these 
    repositories: ['amzn/oss-dashboard']      # are required

    data-directory: /full/path/to/directory/to/store/data
    reports: [ 'DocsReporter', 'LicenseReporter', 'BinaryReporter' ]
    db-reports: [ 'UnknownCollaboratorsDbReporter', 'LeftEmploymentDbReporter', 'UnknownMembersDbReporter', 'WikiOnDbReporter', 'EmptyDbReporter', 'UnchangedDbReporter', 'NoIssueCommentsDbReporter', 'NoPrCommentsDbReporter', 'RepoUnownedDbReporter', 'LabelDbReporter', 'AverageIssueCloseDbReporter', 'AveragePrCloseDbReporter' ]
    www-directory: /full/path/to/generate/html/to

    private-access: ['amznlabs']     # Optional
    report-path: ['/full/path/to/directory/of/custom/reports']  # Optional
    db-report-path: ['/full/path/to/directory/of/custom/db-reports']  # Optional
    map-user-script: /full/path/to/script   # Optional

  database:
    engine: 'postgres'
    username: 'USERNAME'
    password: 'PASSWORD'
    server: 'localhost'
    port: 5432
    database: 'DATABASENAME'

database configuration

Configure the database section above by setting the username, password, server, port and database settings. You can set up the table manually, or, if it isn't set up, oss-dashboard will attempt to set it up for you.
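To sanity-check the database settings before a full run, it can help to turn them into a single connection URL. The sketch below is an illustration only: `database_url` is a hypothetical helper, and how oss-dashboard itself builds its connection is not shown here. Sequel's `Sequel.connect` accepts a URL of this shape.

```ruby
require 'erb'
require 'yaml'

# Hypothetical helper: build a connection URL from the `database` section.
# The username and password are percent-encoded so characters such as '@'
# do not break the URL.
def database_url(db)
  user = ERB::Util.url_encode(db.fetch('username').to_s)
  pass = ERB::Util.url_encode(db.fetch('password').to_s)
  "#{db.fetch('engine')}://#{user}:#{pass}@#{db.fetch('server')}:#{db.fetch('port')}/#{db.fetch('database')}"
end

config = YAML.load(<<~CONFIG)
  database:
    engine: 'postgres'
    username: 'USERNAME'
    password: 'PASSWORD'
    server: 'localhost'
    port: 5432
    database: 'DATABASENAME'
CONFIG

puts database_url(config['database'])
# With the sequel and pg gems installed, the URL could then be tried with:
#   require 'sequel'
#   Sequel.connect(database_url(config['database'])).test_connection
```

Using `fetch` means a missing setting raises a `KeyError` naming the absent key instead of producing a malformed URL.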

organizations

This lists the organizations that you wish to include in your dashboard.

logins

This lists the user logins that you wish to include in your dashboard. Under the hood, the dashboard treats logins and organizations largely the same, storing their data in the same location.

repositories

This lists any repositories you wish to include in your dashboard (i.e. pull in a repository rather than every repository in the organization). It currently only supports repositories in organizations.
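The example file notes that at least one of organizations, logins or repositories is required. A pre-flight check along these lines can catch an empty configuration early. It is a sketch under assumptions: the helper names are hypothetical, and the 'org/repo' shape check is inferred from the example value 'amzn/oss-dashboard' rather than from the dashboard's code.

```ruby
# Keys in the `dashboard` section that name what to include.
SOURCE_KEYS = %w[organizations logins repositories].freeze

# Hypothetical helper: return only the source keys that have values.
def configured_sources(dashboard)
  SOURCE_KEYS.each_with_object({}) do |key, found|
    values = Array(dashboard[key])
    found[key] = values unless values.empty?
  end
end

# Hypothetical helper: raise if nothing is configured or a repository
# entry is not of the form 'org/repo'.
def validate_sources!(dashboard)
  sources = configured_sources(dashboard)
  if sources.empty?
    raise ArgumentError, "configure at least one of: #{SOURCE_KEYS.join(', ')}"
  end

  bad = Array(dashboard['repositories']).reject { |r| r.to_s =~ %r{\A[^/\s]+/[^/\s]+\z} }
  raise ArgumentError, "repositories must look like 'org/repo': #{bad.inspect}" unless bad.empty?

  sources
end
```

For example, `validate_sources!('repositories' => ['amzn/oss-dashboard'])` passes, while an empty hash raises an `ArgumentError`.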

data-directory

This is where the scripts will store the database and checked out code.

reports

Which reports you wish to be executed on the code. Note that LicenseReporter both provides a report and uses the Licensee project to identify the basic top level license file.

db-reports

Which reports you wish to be executed on the database.

www-directory

Where you want the dashboard output to go.

Optional: private-access

If your access token is configured so it can see the private side of an organization, adding to this list will enable those features.

Optional: report-path

This is a list of paths to look for custom Reporters.

Optional: db-report-path

This is a list of paths to look for custom Database Reporters.

Optional: map-user-script

Interaction between GitHub's user schema and your own user schema is a common use case for a dashboard. This script is executed to load in your customized data.

The user db schema contains an email address field, to represent your internal login, and an is_employee field (0=not employed), to represent whether they are currently employed. Executing this script is the responsibility of the github-sync/user-mapping subphase.

(Warning - clunky system) The script provides a USER_EMAILS hash of GitHub login to internal email address. It can also provide an updateUserData function to, for example, update the is_employee column.

For example:

  USER_EMAIL = {
    "github-login" => "internal-email-address"
  }

  def updateUserData(feedback, dashboard_config, client, sync_db)
    # code to talk to internal employment database and update the users.is_employee field to 0 if someone has left
  end

Optional: license-hashes

The license report both identifies licenses for the Repository Metrics section, and provides information on why it could not find the license in its report output. When a license file is available and can't be recognized, it includes the Licensee project's hash. One can provide a separate YAML file to identify licenses that the Licensee project is unable to identify.

This is configured in two steps. Firstly, add a line to your dashboard config pointing to your license-hashes.yml file:

    license-hashes: '/full/path/to/license-hashes.yml'

Then create that license-hashes.yml file with content similar to:

   - name: 'Custom License A'
     hash: 'c189e0a7f6a535af91b0d3e1b1a3de1ea4443d69'
   - name: 'Custom License B'
     hash: '84b3be39b2d06ca7b5afe43b461544f7dd7c2f1a'

These hashes are found on the Repository -> Reports -> License Report, which saves you having to write code against Licensee to identify the hash.
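Because the hashes are copied by hand from the License Report, a quick check of the YAML file can catch copy errors. The following is a minimal sketch: `check_license_hashes` is a hypothetical helper, and the assumption that every hash is 40 lowercase hexadecimal characters is based only on the two example entries above.

```ruby
require 'yaml'

# Assumed hash shape, taken from the example entries: 40 lowercase hex characters.
HASH_PATTERN = /\A[0-9a-f]{40}\z/

# Hypothetical helper: return a list of problems found in the parsed file.
def check_license_hashes(entries)
  problems = []
  Array(entries).each_with_index do |entry, index|
    unless entry.is_a?(Hash)
      problems << "entry #{index}: not a name/hash mapping"
      next
    end
    name = entry['name']
    problems << "entry #{index}: missing name" if name.nil? || name.to_s.empty?
    problems << "entry #{index}: hash is not 40 hex characters" unless entry['hash'].to_s =~ HASH_PATTERN
  end
  problems
end

entries = YAML.load(<<~HASHES)
  - name: 'Custom License A'
    hash: 'c189e0a7f6a535af91b0d3e1b1a3de1ea4443d69'
  - name: 'Custom License B'
    hash: '84b3be39b2d06ca7b5afe43b461544f7dd7c2f1a'
HASHES

puts check_license_hashes(entries).empty? ? 'license-hashes look fine' : check_license_hashes(entries)
```

Keeping the hash values quoted in the YAML, as in the example, avoids a value made only of digits being parsed as a number.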

Optional: Hiding private repositories

If you don't want to download private repositories, you can add that as a configuration option:

    hide-private-repositories: true

Running

With the configuration file created, you should execute the following:

  # Instead of providing the --ghconfig file, you can set the 
  # GH_ACCESS_TOKEN environment variable with your access token.
  ruby refresh-dashboard.rb --ghconfig {path to config-github.yml} {path to config-dashboard.yml} 

For large repositories, or for a quick review, you can use the --light flag. This creates a database of only the metadata and generates a dashboard.

  # Instead of providing the --ghconfig file, you can set the 
  # GH_ACCESS_TOKEN environment variable with your access token.
  ruby refresh-dashboard.rb --light --ghconfig {path to config-github.yml} {path to config-dashboard.yml} 

To run only part of the system, you can add an additional argument for the phase desired. This can be useful to fill in data after running with the --light flag.

  # Instead of providing the --ghconfig file, you can set the 
  # GH_ACCESS_TOKEN environment variable with your access token.
  ruby refresh-dashboard.rb --ghconfig {path to config-github.yml} {path to config-dashboard.yml} {phase}

Available phases are:

| Phase | Description |
|---|---|
| init-database | Initializes the database file |
| github-sync | Syncs all the data down from GitHub (runs all of the github-sync/ phases below) |
| github-sync/metadata | Syncs only the metadata (org, repo, teams, org-members etc) |
| github-sync/commits | Syncs only the commit data |
| github-sync/events | Syncs only the event stream |
| github-sync/issues | Syncs the issue data - note that this is typically the heaviest initial load |
| github-sync/issue-comments | Syncs the issue comments |
| github-sync/releases | Syncs the release data |
| github-sync/user-mapping | Loads your user-mapping file into the database |
| github-sync/reporting | Runs the configured DB Reports |
| pull-source | Pulls down the source code from GitHub |
| review-source | Runs your source Reports on the pulled source code |
| generate-dashboard | Generates a dashboard (runs all of the generate-dashboard/ phases below) |
| generate-dashboard/xml | Outputs the XML for organizations |
| generate-dashboard/merge | Merges the organization XML into a single XML file |
| generate-dashboard/teams-xml | Splits the organizations up into separate Team XML files |
| generate-dashboard/xslt | Turns the XML files into HTML |
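When several phases need to be run one after another, a small wrapper can assemble the command line shown above for each phase and stop at the first failure. This is a sketch, not part of the repository: `refresh_command` and `run_phases` are hypothetical names, and the wrapper assumes it is started from the directory containing refresh-dashboard.rb.

```ruby
# Phase names copied from the table of available phases.
KNOWN_PHASES = %w[
  init-database github-sync github-sync/metadata github-sync/commits
  github-sync/events github-sync/issues github-sync/issue-comments
  github-sync/releases github-sync/user-mapping github-sync/reporting
  pull-source review-source generate-dashboard generate-dashboard/xml
  generate-dashboard/merge generate-dashboard/teams-xml generate-dashboard/xslt
].freeze

# Hypothetical helper: build the argument list in the same order as the
# documented command line.
def refresh_command(dashboard_config, phase: nil, ghconfig: nil, light: false)
  cmd = ['ruby', 'refresh-dashboard.rb']
  cmd << '--light' if light
  cmd.push('--ghconfig', ghconfig) if ghconfig
  cmd << dashboard_config
  cmd << phase if phase
  cmd
end

# Hypothetical helper: run each phase in order. Returns the name of the first
# phase that failed, or nil if every phase succeeded.
def run_phases(phases, dashboard_config, ghconfig: nil, runner: ->(cmd) { system(*cmd) })
  phases.each do |phase|
    raise ArgumentError, "unknown phase: #{phase}" unless KNOWN_PHASES.include?(phase)

    ok = runner.call(refresh_command(dashboard_config, phase: phase, ghconfig: ghconfig))
    return phase unless ok
  end
  nil
end
```

For example, `run_phases(%w[github-sync/metadata github-sync/issues], 'config-dashboard.yml', ghconfig: 'config-github.yml')` runs the two sync phases in sequence. The `runner` argument exists so the command can be logged or stubbed instead of executed.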

Use Docker

NOTE: Experimental; feedback via GitHub issues much appreciated.

You can run oss-dashboard with Docker or docker-compose.

If you have already set up a PostgreSQL server for oss-dashboard, run oss-dashboard with Docker.
If you don't have a PostgreSQL server, run oss-dashboard with docker-compose, which starts the oss-dashboard and postgres containers at the same time.

Docker

  docker build -t oss-dashboard .
  docker run \
    -e GH_ACCESS_TOKEN=${GH_ACCESS_TOKEN} \
    oss-dashboard

To customize the dashboard or GitHub settings, write the configuration files (config-dashboard.yaml and config-github.yaml) in the root directory of the repository. (See the Setup section for the contents of the configuration files.)

Then execute the following commands.

  # The data mount is only needed if you want the data files (the configured `data-directory`).
  # The html mount is only needed if you want the html files (the configured `www-directory`).
  docker build -t oss-dashboard .
  docker run \
    -v $PWD/config-dashboard.yaml:/oss-dashboard/config-dashboard.yaml \
    -v $PWD/config-github.yaml:/oss-dashboard/config-github.yaml \
    -v $PWD/data:/oss-dashboard/data \
    -v $PWD/html:/oss-dashboard/html \
    oss-dashboard refresh-dashboard.rb --ghconfig config-github.yaml config-dashboard.yaml

If you connect to your organization's GitHub Enterprise, you must set the OCTOKIT_API_ENDPOINT environment variable to your GitHub Enterprise API endpoint.

  docker run \
    -v $PWD/config-dashboard.yaml:/oss-dashboard/config-dashboard.yaml \
    -v $PWD/config-github.yaml:/oss-dashboard/config-github.yaml \
    -e OCTOKIT_API_ENDPOINT=https://github.mycompany.com/api/v3/ \
    oss-dashboard refresh-dashboard.rb --ghconfig config-github.yaml config-dashboard.yaml

Docker Compose

Before running oss-dashboard with docker-compose, prepare config-dashboard.yaml (and config-github.yaml if you need it) in the root directory.
Then run the following command.

docker-compose up

If you need to use config-github.yaml (see the Setup section), edit docker-compose.yml as follows and then run docker-compose up.

(snip)
    command: refresh-dashboard.rb --ghconfig config-github.yaml config-dashboard.yaml  # Add `--ghconfig` option
    volumes:
       - ./config-dashboard.yaml:/oss-dashboard/config-dashboard.yaml
       - ./config-github.yaml:/oss-dashboard/config-github.yaml   # Uncomment this line
(snip)

Helper Tools

You only get 5000 requests an hour to GitHub, so keeping an eye on your current request count can be important.

  # Instead of providing the file, you can set the 
  # GH_ACCESS_TOKEN environment variable with your access token.
  ruby github-sync/util/get_rate_limit.rb --ghconfig {path to config-github.yml}
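For a check that does not depend on the repository's scripts, GitHub's `/rate_limit` endpoint can be queried directly. The sketch below is independent of get_rate_limit.rb and makes no claim about how that script works. The helper names and the User-Agent string are made up for illustration, and it reads only the `resources.core` entry of the response.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: pull the core API quota out of a /rate_limit response body.
def parse_rate_limit(body)
  core = JSON.parse(body).fetch('resources').fetch('core')
  {
    limit: core.fetch('limit'),
    remaining: core.fetch('remaining'),
    resets_at: Time.at(core.fetch('reset')).utc # 'reset' is a Unix timestamp
  }
end

# Hypothetical helper: fetch the current quota, sending the token in the
# Authorization header.
def fetch_rate_limit(token)
  uri = URI('https://api.github.com/rate_limit')
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "token #{token}"
  request['User-Agent'] = 'oss-dashboard-rate-check' # illustrative value
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
  raise "GitHub returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  parse_rate_limit(response.body)
end

# Example use (performs a network request):
#   quota = fetch_rate_limit(ENV.fetch('GH_ACCESS_TOKEN'))
#   puts "#{quota[:remaining]} of #{quota[:limit]} requests left, resets at #{quota[:resets_at]}"
```

Parsing is kept separate from the HTTP call so the quota logic can be exercised without network access.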

The following query shows you the size of each of your tables. It needs porting to Ruby so it can take advantage of the config.

  ruby github-sync/queries/db-summary.rb {path to database file}

Large Organizations

Because of that 5000 request limit, loading the data for large organizations can be difficult. While in principle you should be able to run the dashboard repeatedly until your database is full (at least until you hit a repository that would take more than 5000 requests), this hasn't been tested and the dashboard does not yet fail gracefully.

Running each phase at a time is advised; chances are you will need to run github-sync/issues repeatedly until full. You can edit the configuration so it only runs on the org you are adding during that manual import, then put the full list back again.

Another approach is to turn on the --xsync flag for issues/commits/releases/events. This uses a queue to synchronize the data rather than trying to do it all in one go. It's not recommended that you use the --xsync flag for metadata as it doesn't cleanly delete old data that has gone from GitHub.

Notes on Output Warnings

By default the refresh-dashboard.rb script outputs '.' characters to show it has taken care of a repository (or whatever the 'atom' being operated on is in that phase). Sometimes it outputs a '!'. Here's why:

  • github-sync/commits - A '!' here means it skipped an empty repository to avoid an Octokit error.

Bootstrap Themes

The HTML generated relies, amongst other libraries, on Bootstrap. The HTML files look for a file named bootstrap-theme.css in the same directory, allowing you to customize the look and feel of the dashboard (typically by finding a theme you like and using that).
