Add REST API, update command line scripts to use REST API #54

Open
hltcoe-bot opened this issue Aug 7, 2020 · 13 comments

Comments

@hltcoe-bot
Collaborator

The current command line scripts scrape the HTML of the admin pages, and break when the formatting of these pages changes.

This seems to be one of the more popular REST frameworks for Django:

https://www.django-rest-framework.org
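To illustrate why scraping is fragile, here is a minimal sketch comparing the two approaches. The HTML snippets, CSS class names, and JSON shape are made up for illustration and are not taken from the actual admin pages.

```python
import json
import re

# Hypothetical markup before and after a template change (not the real admin HTML).
OLD_HTML = '<td class="field-name">Batch 1</td>'
NEW_HTML = '<td class="name-col"><span>Batch 1</span></td>'

# Hypothetical JSON body that a REST endpoint might return.
JSON_BODY = '{"batches": [{"name": "Batch 1"}]}'


def scrape_batch_names(html):
    """Scraping approach: depends on the exact tag and class name."""
    return re.findall(r'<td class="field-name">([^<]+)</td>', html)


def parse_batch_names(body):
    """API approach: depends only on the documented JSON keys."""
    return [batch["name"] for batch in json.loads(body)["batches"]]
```

The scraper finds the name in `OLD_HTML` but silently returns nothing for `NEW_HTML`, while the JSON parser is unaffected by how the page is rendered.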

Poster: Craig Harman id: 154

@hltcoe-bot
Collaborator Author

changed the description

Poster: Craig Harman

@hltcoe-bot
Collaborator Author

mentioned in merge request !128

Poster: Craig Harman

@hltcoe-bot
Collaborator Author

MTurk has a full API. Do we want to copy parts of it? I haven't checked how extensive it is.

Poster: Cash Costello

@hltcoe-bot
Collaborator Author

I'd need to investigate further, but I suspect the answer is yes. It would be nice to be able to leverage third-party tools related to quality control.

If we want to make it easy for other services (such as AT-AT) to submit Tasks to Turkle, I suspect a REST API will be important for buy in. But from my (current) perspective, #143 and #135 are higher near-term priorities.

Poster: Craig Harman

@hltcoe-bot
Collaborator Author

mentioned in issue #132

Poster: Cash Costello

@hltcoe-bot
Collaborator Author

mentioned in issue #171

Poster: Thomas Lippincott

@hltcoe-bot
Collaborator Author

charman ccostello Is there anything in progress on this? If not, I'll create a branch and start playing with django-rest-framework integration.

Poster: Thomas Lippincott

@hltcoe-bot
Collaborator Author

lippincott discussion on GitHub about this: #32

To summarize:

  • It is either not possible or very difficult to use boto with Turkle because of additional features that we've added
  • The drf-generator project is not a good fit
  • We will need to create a custom API using DRF
  • The next step is sketching out API endpoints

In addition to sketching out the API, I want to minimize duplicated code. I don't want to reimplement batch creation for the API as separate from what happens when forms are submitted.
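One way to avoid that duplication is to route both entry points through a single function. The sketch below is plain Python with hypothetical names (`Batch`, `create_batch`, and the field names are assumptions, not Turkle's actual code); in a real implementation the form view and the DRF view would play the roles of the two handlers.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical stand-in for a batch model; not Turkle's actual class."""
    project_id: int
    name: str
    tasks: list = field(default_factory=list)


def create_batch(project_id, name, csv_text):
    """Single code path for batch creation: parse the CSV and build the batch."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV must contain at least one task row")
    return Batch(project_id=project_id, name=name, tasks=rows)


def handle_form_submission(form_data):
    """Form entry point: unpack the form fields, then delegate."""
    return create_batch(
        int(form_data["project"]), form_data["name"], form_data["csv_text"]
    )


def handle_api_request(json_payload):
    """API entry point: unpack the JSON body, then delegate to the same function."""
    return create_batch(
        json_payload["project_id"], json_payload["name"], json_payload["csv_text"]
    )
```

With this shape, validation and creation logic live in one place, and each entry point only translates its own input format.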

Poster: Cash Costello

@hltcoe-bot
Collaborator Author

ccostello

Agreed w.r.t. duplicated code.

In addition to the task/project management focus of the official API, my interest in a REST API is driven by wanting a simple way to produce backups and manage users/groups without having to SSH into a machine and run cumbersome Docker commands. This would be a modestly nice thing for me, but an extremely useful thing for some other group (like the library) maintaining their own instance with a minimum of hackiness.

It seems good enough to just move in the direction of official-API-compliance by only defining the same commands when they match in functionality, and meanwhile having new, useful commands for these things...?

Poster: Thomas Lippincott

@hltcoe-bot
Collaborator Author

When I did the review of the official MTurk API and how we do things, they were different enough that I came away convinced that we should just build out the API that fits our use cases.

We are template compatible with MTurk but we're not really compatible in any other way.

My API endpoints would be

  • CRUD operations on Projects
  • CRUD operations on Batches
  • CRUD operations on Users and Groups
  • Get annotations
  • Get stats
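As a rough sketch of how the list above could map onto URLs: the `/api` prefix, the paths, and the HTTP verbs are assumptions for discussion, not a settled design. The collection/item split is similar to what DRF routers generate for ViewSets.

```python
def crud_routes(resource):
    """Return (method, path) pairs for a hypothetical CRUD resource."""
    collection = f"/api/{resource}/"
    item = f"/api/{resource}/{{id}}/"
    return [
        ("POST", collection),   # create
        ("GET", collection),    # list
        ("GET", item),          # retrieve
        ("PUT", item),          # update
        ("DELETE", item),       # delete
    ]


ROUTES = []
for resource in ("projects", "batches", "users", "groups"):
    ROUTES.extend(crud_routes(resource))

# Read-only endpoints for annotations and stats (paths are placeholders).
ROUTES.append(("GET", "/api/batches/{id}/annotations/"))
ROUTES.append(("GET", "/api/stats/"))

for method, path in ROUTES:
    print(f"{method:6} {path}")
```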

I want to keep backups as database specific rather than supporting that through an API. If we need to provide detailed instructions and sample scripts, let's do that.

Poster: Cash Costello

@hltcoe-bot
Collaborator Author

Agreed w.r.t. endpoints.

Not that I disagree necessarily, but why database-specific? It seems like a major advantage of Django's agnosticism is you can easily say "give me a snapshot of the whole system as a JSON blob" (and "here's a JSON blob, roll back the system", or, offline, "show me a diff of these two snapshots", etc).
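For the "diff of two snapshots" idea, here is a minimal offline sketch. It assumes snapshots in the JSON layout that Django's `dumpdata` command emits (a list of objects with `model`, `pk`, and `fields`); the model labels in the usage example are placeholders.

```python
import json


def index_snapshot(snapshot_json):
    """Map (model, pk) -> fields for one serialized snapshot."""
    return {
        (obj["model"], obj["pk"]): obj["fields"]
        for obj in json.loads(snapshot_json)
    }


def diff_snapshots(old_json, new_json):
    """Report which objects were added, removed, or changed between snapshots."""
    old = index_snapshot(old_json)
    new = index_snapshot(new_json)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            key for key in old.keys() & new.keys() if old[key] != new[key]
        ),
    }
```

Usage, with placeholder data:

```python
old = '[{"model": "app.project", "pk": 1, "fields": {"name": "a"}}]'
new = '[{"model": "app.project", "pk": 1, "fields": {"name": "b"}}]'
print(diff_snapshots(old, new))
```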

Poster: Thomas Lippincott

@hltcoe-bot
Collaborator Author

Now that I'm looking at it, the official MTurk API mostly covers my concerns (there's an endpoint to get an assignment, which includes answers). So, as long as it's possible to perform these smaller requests and in principle back up any user-provided answers, this is fine (and would never involve serving a huge monolith). I know we're not necessarily going to replicate Amazon's API, but I'm guessing we'll at least have endpoints along the lines of get_projects(), get_hits(project_id), get_assignments(hit_id), get_assignment(assignment_id), and that's plenty of functionality.
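A backup built from those smaller requests could look roughly like this. The four method names come from the list above, but their return shapes (lists of ids, a dict of answers) are assumptions, and `FakeApi` is an in-memory stand-in rather than a real client.

```python
def backup_answers(api):
    """Walk projects -> HITs -> assignments and collect every answer."""
    backup = {}
    for project_id in api.get_projects():
        for hit_id in api.get_hits(project_id):
            for assignment_id in api.get_assignments(hit_id):
                backup[assignment_id] = api.get_assignment(assignment_id)
    return backup


class FakeApi:
    """In-memory stand-in; a real client would issue one HTTP request per call."""

    def __init__(self):
        self.hits = {1: [10, 11]}
        self.assignments = {10: [100], 11: [101, 102]}
        self.answers = {
            100: {"label": "cat"},
            101: {"label": "dog"},
            102: {"label": "cat"},
        }

    def get_projects(self):
        return list(self.hits)

    def get_hits(self, project_id):
        return self.hits[project_id]

    def get_assignments(self, hit_id):
        return self.assignments[hit_id]

    def get_assignment(self, assignment_id):
        return self.answers[assignment_id]


backup = backup_answers(FakeApi())
```

Each request stays small, and the full backup is assembled on the client side.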

Poster: Thomas Lippincott

@hltcoe-bot
Collaborator Author

mentioned in merge request !231

Poster: Craig Harman
