Merge pull request IQSS#793 from IQSS/master
Beta 3 build, 7/31
kcondon committed Jul 31, 2014
2 parents 626c216 + 8475d47 commit 15f42ae
Showing 86 changed files with 5,042 additions and 773 deletions.
63 changes: 63 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,63 @@
# Contributing to Dataverse

Thank you for your interest in contributing to Dataverse! We welcome contributions of ideas, bug reports, documentation, code, and more!

## Ideas/Feature Requests

The [Dataverse roadmap][] might already capture your idea or feature request, but if not, the best way to bring it to the community's attention is by posting on the [dataverse-community Google Group][]. You're also welcome to make some noise in the [#dataverse IRC channel][] or to cram your idea into 140 characters and mention [@thedataorg][] on Twitter.

See also our [Community and Support][] page.

[#dataverse IRC channel]: http://webchat.freenode.net/?channels=dvn
[Dataverse roadmap]: http://datascience.iq.harvard.edu/dataverse/roadmap
[@thedataorg]: http://twitter.com/thedataorg
[Community and Support]: http://datascience.iq.harvard.edu/dataverse/support

## Bug Reports/Issues

An issue is a bug (a feature is no longer behaving the way it should) or a feature (something new to Dataverse that helps users complete tasks). You can browse the Dataverse [issue tracker] on GitHub by open or closed issues or by milestones.

[issue tracker]: https://github.com/IQSS/dataverse/issues

Before submitting an issue, please search the existing issues by using the search bar at the top of the page. If there is an existing issue that matches the issue you want to report, please add a comment to it.

If there is no pre-existing issue, please click on the "New Issue" button, log in, and describe the issue. A member of the Dataverse development team will tag it appropriately and assign it to someone on the team.

### Issue Labels

- **Component**: specifies the part of Dataverse the issue relates to
- **Priority**:
  - **Critical**: needs to be fixed right away, prevents a user from completing a task
  - **High**: it’s a priority to be completed for the assigned milestone
  - **Medium**: planned for that milestone, but if needed, it can be re-considered
- **Status**:
  - **In Design**: mockups and wireframes are being created
  - **In Dev**: being developed
  - **In QA**: testing to make sure it is behaving as wanted
- **Type**:
  - **Bug**
  - **Feature**
  - **Suggestion**

### Issue Attachments

You can attach an image or screenshot by dragging and dropping it, selecting it, or pasting it from the clipboard. The file must be a [supported image format] such as PNG, GIF or JPG; otherwise you will have to include a URL that points to the file in question.

[supported image format]: https://help.github.com/articles/issue-attachments

## Documentation

All of our documentation is in the GitHub repo under the "[doc][]" folder. If you find a typo or inaccuracy or something to clarify, please send us a pull request!

## Code/Pull Requests

To get started developing code for Dataverse, please read our Developer's Guide at http://dataverse-demo.iq.harvard.edu/guides/index.html or in the "[doc][]" folder mentioned above.

[doc]: https://github.com/IQSS/dataverse/tree/master/doc

Before you start coding, please reach out to us either on our [dataverse-community Google Group][], [IRC][], or via [email protected] to make sure the effort is well coordinated and we avoid merge conflicts.

[dataverse-community Google Group]: https://groups.google.com/group/dataverse-community
[IRC]: http://irclog.iq.harvard.edu/dataverse/today

If your pull request is not assigned to anyone in a timely manner, please reach out. The assignee is responsible for evaluating the pull request and deciding whether or not to merge it in. Please try to make it easy to merge in pull requests. Tests are great. :)
Binary file modified doc/Architecture/UsersAndGroups.png
2 changes: 1 addition & 1 deletion doc/Architecture/UsersAndGroups.svg
41 changes: 7 additions & 34 deletions doc/Architecture/UsersAndGroups.uml
@@ -3,15 +3,6 @@
'uncomment for higher dpi
'skinparam dpi 300

note as n2
Invariant: We <b>always</b> have a user
Local users are just users supplied by a local provider
Non-explicit groups, maps a set of
HTTP request properties
to an internal RoleAssignee object
end note


package existingCode {
class Role< DB >
class DvObject< DB >
@@ -37,34 +28,19 @@ package assignees {
interface User {
}

class AuthenticatedUser {
class AuthenticatedUser< DB > {

}

class IpGroup {
range: IpRange[]
}

class DatabaseUser< DB > {
+ id:Long
+ name:String
}

class RoleAssigneeRecord< DB > {
identifier: String
displayInfo: TBD
}
note left
Locally persistent proxy for the assignee
end note


class ShibUser {
persisstentId: String
shibIdp: String
}

class OauthUser
class LdapUser
class GuestUser {
showInLists: False
}
@@ -90,6 +66,7 @@ end note

class AuthenticatedUsers
class AllUsers

class ShibGroup {
headerMatchers: Map<String, Regex>
}
@@ -100,10 +77,6 @@ RoleAssignee <|-- User
RoleAssignee <|-- Group
User <|-- AuthenticatedUser
User <|-- GuestUser
AuthenticatedUser <|-- DatabaseUser
AuthenticatedUser <|-- ShibUser
AuthenticatedUser <|-- OauthUser
AuthenticatedUser <|-- LdapUser
Group <|-- ExplicitGroup
Group <|-- AuthenticatedUsers
Group <|-- AllUsers
@@ -137,6 +110,7 @@ package roleassigneeprovider {

interface RoleAssigneeProvider {
+ info : RoleAssigneeProviderInfo
isQueryable(): Boolean
acceptsRoleAssigneeIdentifier( idtf:String ): bool
getRoleAssignee( idtf:String ) : RoleAssignee
getRoleAssignee( req:HttpRequest ) : RoleAssignee
@@ -161,8 +135,8 @@ package roleassigneeprovider {
class DatabaseAssigneeProvider
class ShibAssigneeProvider
class IpAddressAssigneeProvider
class LdapAssigneeProvider
class OAuthAssigneeProvider
class LdapAssigneeProvider < future >
class OAuthAssigneeProvider < future >

RoleAssigneeManager *--> "1..*" RoleAssigneeProvider
RoleAssigneeProvider <|.. OAuthAssigneeProvider
@@ -182,9 +156,8 @@ package roleassigneeprovider {

}

Group ..> RoleAssigneeProvider : "Created By"
Group <.. RoleAssigneeProvider : "Creates"
User <.. RoleAssigneeProvider : "Creates"
AuthenticatedUser <.. RoleAssigneeProvider : "Creates/Updates"

package somewhere_else_in_dataverse {
class AccessRequest< DB > {
75 changes: 54 additions & 21 deletions doc/Architecture/auth.md
@@ -1,40 +1,60 @@
# Auth - internal
# Pluggable Authentication and Authorization
## TODO

## System goals

> Go from (User, DvObject, Request) to a set of permissions
> For DvObject - who holds what permissions
## Design Goals
## Concepts
<dl>
<dt>DvObject</dt>
<dd>Short for "Dataverse Object". One of Dataverse, Dataset, or Data File.</dd>
<dt>Role</dt>
<dd>Assumed by assignees on a given <code>DvObject</code>. A role entitles its assignees to a set of permissions.</dd>
<dt>Assignee</dt>
<dd>An entity that can be assigned a role on a given <code>DvObject</code>.</dd>
<dt>User</dt>
<dd>A type of Assignee. An entity that can issue commands. Normally refers to a real person, but can also refer to a group (e.g. <code>GuestUser</code>).</dd>
<dt>Group</dt>
<dd>A set of Assignees. A group may hold explicit references to its members (e.g. the <code>ExplicitGroup</code> class). A group may also use logic to determine membership (e.g. an <code>IpGroup</code> will look at the IP address the request is coming from, <code>AuthenticatedUsers</code> will check that the user is not a <code>GuestUser</code> instance, a Shibboleth group will match against Shibboleth headers in the HTTP request). <br/>
A Group is a type of Assignee. This means that groups can contain other groups, which makes them a powerful user management tool.</dd>
</dl>
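
The relationships between these concepts can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the actual Dataverse code: only the type names `RoleAssignee`, `User`, `Group`, `ExplicitGroup`, `GuestUser` and `AuthenticatedUsers` come from the design, while `NamedUser`, the `contains` method and the identifier strings are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the concepts above; not the actual Dataverse classes.
interface RoleAssignee {
    String identifier();
}

interface User extends RoleAssignee {}

// A group answers membership questions; how it does so is up to the implementation.
interface Group extends RoleAssignee {
    boolean contains(RoleAssignee candidate);
}

final class GuestUser implements User {
    public String identifier() { return ":guest"; } // identifier format is an assumption
}

// Illustrative stand-in for an authenticated user.
final class NamedUser implements User {
    private final String id;
    NamedUser(String id) { this.id = id; }
    public String identifier() { return id; }
}

// Holds explicit references to its members, which may themselves be groups.
final class ExplicitGroup implements Group {
    private final String id;
    private final Set<RoleAssignee> members = new HashSet<>();

    ExplicitGroup(String id) { this.id = id; }
    public String identifier() { return id; }
    void add(RoleAssignee member) { members.add(member); }

    public boolean contains(RoleAssignee candidate) {
        for (RoleAssignee member : members) {
            if (member == candidate) {
                return true;
            }
            // Recurse into nested groups (this sketch assumes no membership cycles).
            if (member instanceof Group && ((Group) member).contains(candidate)) {
                return true;
            }
        }
        return false;
    }
}

// Membership decided by logic rather than by stored references.
final class AuthenticatedUsers implements Group {
    public String identifier() { return ":authenticated-users"; } // assumption
    public boolean contains(RoleAssignee candidate) {
        return candidate instanceof User && !(candidate instanceof GuestUser);
    }
}
```

Because `ExplicitGroup` stores `RoleAssignee` references rather than `User` references, adding one group to another needs no special handling, which is what makes nested groups possible.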

## Notes About the Design
* Invariant - there's always a user
- Required for auditing, logging, etc.
- Required for the Command Architecture
- Also, required for auditing, logging, etc.
* The design is inspired by JDBC
* Each identity provider has its own "Driver", but all conform to the same high-level interface.
* Assignees have "URL"s, the first part of which identifies the `RoleAssigneeProvider` that created the user.
* A single user object might refer to a group of actual people (specifically, `GuestUser`).
- A single physical person is referred to by subclasses of `AuthenticatedUser`
- A `GuestUser` can be a member of a group, either explicitly (i.e. inclusion in an `ExplicitGroup`) or via other request headers (e.g. a guest accessing Dataverse from an IP address included in an `IpGroup`).
* All groups live in the DB. Their users might not.
* No group membership is cached on our side, to avoid stale data (as in, user removed from institute directory but still has institute permission on Dataverse).
* One cannot go from a user to all the groups said user belongs to. Example: IP group membership is determined at *query* time.
* All groups and users are equal (i.e., there is nothing special about Local Users, except that this user provider is bundled with the system)
* JDBC inspired
* Assignees have "URL"s, the first part of which identifies the `RoleAssigneeProvider` that created the user. The suffix of the URL may allow the `RoleAssigneeProvider` to generate the user (e.g. `DatabaseUserProvider`).
- A single physical person is referred to by `AuthenticatedUser`
- A `GuestUser` is a regular user, and can be a member of a group, either explicitly (i.e. inclusion in an `ExplicitGroup` or some other group hierarchy) or via other request headers (e.g. guest accessing Dataverse from an IP address included in an `IpGroup` ).
* All groups live in the DB.
* All authenticated users live in the DB as well. Users whose information is maintained on other systems (e.g. Shibboleth, ActiveDirectory) might have their cached information updated when they log in.
* For authenticated users from other systems, group memberships are never cached on the Dataverse side, to avoid stale data (as in, user removed from institute directory but still has institute permission on Dataverse).
- A `RoleAssigneeProvider` may or may not allow Dataverse to issue queries about group memberships of individual users. Users coming from providers that allow such queries will have an easier time using API keys, as their group memberships can be validated without them logging in.
* One cannot go from a user to all the groups said user belongs to. Example: IP group membership is determined at *query* time and based on IP address, not on user id.
* The mapping from the user generated by the `RoleAssigneeProvider` to the `AuthenticatedUser` is done using a lookup table, mapping from the user's id (a JDBC URL-like string) to an `AuthenticatedUser.id`. In the long run, this will allow multiple logins to map to the same internal system user (e.g. a login with Shibboleth or a Mendeley username ends up with the same user).
- Multiple logins are not planned for 4.0
* All types of groups and users are equal (i.e., there is nothing special about Local Users, except that this user provider is bundled with the system)
- In particular, `DatabaseUser` is not a subclass of `AuthenticatedUser`. This allows the internal database-backed assignee provider package to evolve independently of the standard cached assignees. `DatabaseUser`s are associated with an `AuthenticatedUser` using the regular lookup table.
* DvObject access requests are sent from `AuthenticatedUser`s. The "to" field is inferred: everyone who has a `Permission.GrantPermissions` permission on said DvObject.
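
The JDBC analogy in the notes above can be sketched as a manager that asks each registered provider whether it accepts an identifier, much as `java.sql.DriverManager` does with JDBC URLs. This is a hedged illustration: the method names `acceptsRoleAssigneeIdentifier` and `getRoleAssignee` follow the UML in this commit, but the `alias://suffix` identifier format, the `PrefixProvider` class, the `resolve` method and the use of a `String` in place of a real `RoleAssignee` object are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Trimmed-down provider contract. Method names follow the UML in this commit;
// a String stands in for a real RoleAssignee object to keep the sketch short.
interface RoleAssigneeProvider {
    boolean acceptsRoleAssigneeIdentifier(String identifier);
    String getRoleAssignee(String identifier);
}

// Hypothetical provider that claims every identifier starting with "<alias>://".
final class PrefixProvider implements RoleAssigneeProvider {
    private final String prefix;

    PrefixProvider(String alias) { this.prefix = alias + "://"; }

    public boolean acceptsRoleAssigneeIdentifier(String identifier) {
        return identifier.startsWith(prefix);
    }

    public String getRoleAssignee(String identifier) {
        // The suffix is all the provider needs to produce the assignee.
        return "assignee[" + identifier.substring(prefix.length()) + "]";
    }
}

// Plays the role that java.sql.DriverManager plays for JDBC URLs.
final class RoleAssigneeManager {
    private final List<RoleAssigneeProvider> providers = new ArrayList<>();

    void register(RoleAssigneeProvider provider) { providers.add(provider); }

    Optional<String> resolve(String identifier) {
        for (RoleAssigneeProvider provider : providers) {
            if (provider.acceptsRoleAssigneeIdentifier(identifier)) {
                return Optional.of(provider.getRoleAssignee(identifier));
            }
        }
        return Optional.empty(); // no registered provider recognizes the identifier
    }
}
```

With this shape, the core system never inspects the part of the identifier after the provider prefix, so adding a provider does not require changes elsewhere.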

## Issues
* API keys permissions are an issue - permissions from groups whose memberships can't be validated at the time of the API call
* Add `Everyone` group
* Add use cases for users logging in from various places
* Add option for actively querying user providers
- e.g. to allow admins to pro-actively assign roles to users on remote directories
- Might require storage for `RoleAssigneeProvider`
* Update display login on `RoleAssigneeRecord`s when their role assignee logs in.
*

## Pluggability - for 4.0
* The permission set of an API key might be different from its owner's permission set, in case some of the permissions come from group memberships that cannot be validated by the system at the time of the call.
- e.g. Shibboleth group membership cannot be dynamically queried (in 4.0), thus membership in local groups that relies on Shibboleth's member-of headers cannot be validated. As this is a security issue, we err on the safe side and assume the user owning the key is not a group member.
- This is only a problem with assignee providers that don't allow Dataverse to query group memberships, so it does not apply to `DatabaseUserProvider`. It might apply to Shibboleth and LDAP.
- Temporary workaround: include the user in an `ExplicitGroup` when needed. Note that this will create stale data in the system, and would mean managing the user's memberships in the institution directory as well as in Dataverse.
- Solution: Use query-able directories
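
The "err on the safe side" rule for API keys can be sketched as follows. This is an illustration only: `isQueryable()` is named in the `RoleAssigneeProvider` interface in this commit, while `MembershipSource`, `isMember` and `ApiKeyGroupResolver` are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of a provider; only isQueryable() is named in the design.
interface MembershipSource {
    boolean isQueryable();

    // Only meaningful when isQueryable() returns true.
    boolean isMember(String userIdentifier, String groupIdentifier);
}

final class ApiKeyGroupResolver {
    // Returns the groups whose membership can be confirmed without an interactive login.
    static List<String> verifiableGroups(String userIdentifier,
                                         List<String> candidateGroups,
                                         MembershipSource source) {
        List<String> confirmed = new ArrayList<>();
        if (!source.isQueryable()) {
            // Membership cannot be validated at call time, so assume the key's
            // owner is not a member of any of these groups.
            return confirmed;
        }
        for (String group : candidateGroups) {
            if (source.isMember(userIdentifier, group)) {
                confirmed.add(group);
            }
        }
        return confirmed;
    }
}
```

Permissions for an API call would then be computed only from the groups this method returns, which is why a key can carry fewer permissions than its owner has after logging in.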

## What does "Pluggability" mean for Dataverse 4.0
* Pull-request based (not full .jar based plugins in a `plugin` directory)
* No "special cases" for different user providers at the back end (*including database schema*)
* UI can have special cases (JSF + backing beans) for each user provider.
- Hopefully, UI will be incorporated into the plug-ins in later versions
* Groups are stored in the `Groups` table. Common fields, defined at the interface level, are normal database fields. Each row holds a reference to a `RoleAssigneeProvider`. Implementation-specific data goes in a blob field (e.g. an `IpGroup` can store a JSON string there, with the ranges).
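
A row of the `Groups` table as described above might look like the sketch below. The field names, `GroupRow` and `IpGroupData` are hypothetical. The design mentions a JSON string for the `IpGroup` payload; this example swaps in a comma-separated list purely to stay free of third-party libraries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row shape: common fields plus an opaque, provider-owned payload.
final class GroupRow {
    final long id;
    final String displayName;
    final String providerAlias; // reference to the RoleAssigneeProvider that owns the row
    final String providerData;  // opaque to the core system

    GroupRow(long id, String displayName, String providerAlias, String providerData) {
        this.id = id;
        this.displayName = displayName;
        this.providerAlias = providerAlias;
        this.providerData = providerData;
    }
}

// Provider-side decoding of the payload. A comma-separated list is used here
// instead of JSON only to avoid a JSON library dependency in the sketch.
final class IpGroupData {
    static List<String> decodeRanges(String providerData) {
        List<String> ranges = new ArrayList<>();
        for (String part : providerData.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                ranges.add(trimmed);
            }
        }
        return ranges;
    }
}
```

Only the owning provider calls something like `decodeRanges`; the core system passes `providerData` along without interpreting it, which keeps the schema free of provider-specific columns.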

## Activities
@@ -48,3 +68,16 @@
3. `permissions = permissions U permissions( roles(dvObj, explicit_groups(dvObj,u)) )`
4. If `dvObj` is a permission root, output `permissions`
4. else, `dvObj` &larr; `parent(dvObj)`
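
The upward walk in the steps above can be sketched as a loop that accumulates permissions until it reaches a permission root. The earlier steps of the activity are not visible in this hunk, so the per-object lookup is passed in as a function. `DvObjectNode`, `PermissionWalk`, and the enum values are hypothetical names for illustration.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Function;

// Illustrative permission values only; not the actual Dataverse enum.
enum Permission { VIEW, EDIT, GRANT_PERMISSIONS }

// Hypothetical stand-in for a DvObject in the containment hierarchy.
final class DvObjectNode {
    final String name;
    final DvObjectNode parent;      // null for the top-level object
    final boolean permissionRoot;

    DvObjectNode(String name, DvObjectNode parent, boolean permissionRoot) {
        this.name = name;
        this.parent = parent;
        this.permissionRoot = permissionRoot;
    }
}

final class PermissionWalk {
    // directPermissions stands in for the per-object steps: the permissions of the
    // roles held on that object directly and through groups.
    static Set<Permission> collect(DvObjectNode start,
                                   Function<DvObjectNode, Set<Permission>> directPermissions) {
        Set<Permission> permissions = EnumSet.noneOf(Permission.class);
        DvObjectNode current = start;
        while (current != null) {
            permissions.addAll(directPermissions.apply(current));
            if (current.permissionRoot) {
                break;                  // output at the permission root
            }
            current = current.parent;   // otherwise continue with the parent
        }
        return permissions;
    }
}
```

In this sketch the walk also stops when it runs out of parents, so a hierarchy with no permission root still terminates.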

## Diagrams
### Class Diagram:
See all involved classes
![Class Diagram](UsersAndGroups.png)

### User login process
![User login](userLogin.png)
Note that Dataverse 4.0 will ship with an internal identity provider, for which the authentication process will be simpler. However, it will follow the same semantics.
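
The last part of the login process, where the provider's user id is either found in the lookup table or registered for the first time, can be sketched as below. This is a minimal in-memory illustration: `LoginRegistry`, `AuthenticatedUserRecord` and `onLogin` are hypothetical names, and two maps stand in for the database tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cached record for an authenticated user.
final class AuthenticatedUserRecord {
    final long id;
    String displayName;

    AuthenticatedUserRecord(long id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }
}

// In-memory stand-in for the lookup table and the AuthenticatedUser table.
final class LoginRegistry {
    private final Map<String, Long> lookupTable = new HashMap<>(); // provider user id -> internal id
    private final Map<Long, AuthenticatedUserRecord> users = new HashMap<>();
    private long nextId = 1;

    AuthenticatedUserRecord onLogin(String providerUserId, String displayName) {
        Long knownId = lookupTable.get(providerUserId);
        if (knownId != null) {
            // Id found: refresh the cached information from the provider.
            AuthenticatedUserRecord existing = users.get(knownId);
            existing.displayName = displayName;
            return existing;
        }
        // Id not found: create the user, then record the mapping.
        AuthenticatedUserRecord created = new AuthenticatedUserRecord(nextId++, displayName);
        users.put(created.id, created);
        lookupTable.put(providerUserId, created.id);
        return created;
    }
}
```

A second login with the same provider id returns the same internal record with refreshed details, matching the update branch of the diagram.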

### Loading groups
This diagram shows how groups are loaded. Note how the system handles the `RoleAssigneeProvider`s' different data structures in an opaque way: as far as the system knows, these are just strings that should be passed to the provider.
![Loading groups](loadGroups.png)
Binary file added doc/Architecture/userLogin.png
1 change: 1 addition & 0 deletions doc/Architecture/userLogin.svg
41 changes: 41 additions & 0 deletions doc/Architecture/userLogin.uml
@@ -0,0 +1,41 @@
@startuml

title Authentication in Dataverse 4.0
autonumber "<font color=blue>"

actor User
participant AuthSystem as "AuthSystem\nMay be external"
box "DataverseSystem" #DDD
participant DataverseUI
participant RoleAssigneeManager
participant RoleAssigneeProvider
participant UserBean
database db
end box

User --> DataverseUI : GET /
User <-- DataverseUI : "Select Login System"
User --> DataverseUI : authSystem
DataverseUI --> RoleAssigneeManager: get( authSystem )
DataverseUI <-- RoleAssigneeManager: authSystem
User <-- DataverseUI: redirect to authSystem
User --> AuthSystem : credentials
User <-- AuthSystem : Ok, back to DataverseUI
User --> DataverseUI : authenticated( data )
DataverseUI --> RoleAssigneeProvider: getUserObj( data )
DataverseUI <-- RoleAssigneeProvider: userObj
DataverseUI --> UserBean : setUser( userObj )
UserBean --> db : lookupAuthenticatedUser( userObj.id )

alt id found
UserBean <-- db : authenticatedUser
UserBean --> db : update( authenticatedUser, userObj )

else id not found
UserBean <-- db : "user not found"
UserBean --> db : createAuthenticatedUser( userObj )
UserBean <-- db : authenticatedUser
UserBean --> db : updateLookupTable( userObj.id, authenticatedUser.id )
end

@enduml
35 changes: 35 additions & 0 deletions doc/Sphinx/source/Installers/dataverse-installer-main.rst
@@ -6,6 +6,41 @@ Installers Guide

**Introduction**

JVM Options
+++++++++++

If you need to change the hostname the Data Deposit API returns:

``asadmin delete-jvm-options "-Ddataverse.fqdn=old.example.com"``

``asadmin create-jvm-options "-Ddataverse.fqdn=dataverse.example.com"``

**Enforce SSL on SWORD**

- Set up a connector between Apache and Glassfish
``asadmin create-network-listener --protocol http-listener-1 --listenerport 8009 --jkenabled true jk-connector``

- Apache dataverse.conf

Add the following to ``/etc/httpd/conf.d/dataverse.conf``

.. code-block:: guess

    # From https://wiki.apache.org/httpd/RewriteHTTPToHTTPS
    RewriteEngine On
    # This will enable the Rewrite capabilities
    RewriteCond %{HTTPS} !=on
    # This checks to make sure the connection is not already HTTPS
    # RewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]
    RewriteRule ^/dvn/api/data-deposit/?(.*) https://%{SERVER_NAME}/dvn/api/data-deposit/$1 [R,L]
    # This rule will redirect users from their original location, to the same location but using HTTPS.
    # i.e. http://www.example.com/foo/ to https://www.example.com/foo/
    # The leading slash is made optional so that this will work either in httpd.conf or .htaccess context

The guide is intended for anyone who needs to install the Dataverse app.

If you encounter any problems during installation, please contact the
Binary file not shown.
8 changes: 8 additions & 0 deletions local_lib/edu/harvard/iq/dvn/unf5/5.0/unf5-5.0.pom
@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<modelVersion>4.0.0</modelVersion>
<groupId>edu.harvard.iq.dvn</groupId>
<artifactId>unf5</artifactId>
<version>5.0</version>
</project>