CERN Unified Storage API


Aim of the project

The IT DB Storage team needs to work with several storage providers in order to deliver storage solutions adapted to the needs of different projects. So far, a library and a set of scripts have been developed, mainly to interact with NetApp storage. The current solution presents a RESTful API developed using Flask and Python 3.

Usage

Accessing /login will put you through the standard CERN single sign-on process and return you to the API with an authenticated session once you are done. For use in Python scripts and applications, setting the relevant session cookies using cern-sso-python is recommended.

For API documentation, see the generated Swagger documentation (linked below).

Components and Design

The Storage API uses the flask-sso module developed by Invenio. Authorisation is based on CERN e-groups (Active Directory distribution lists), and user authentication is done using the central OAuth2 providers.

The application is designed around common Python and Flask framework patterns, with some help from the flask-restplus framework to provide boilerplate REST code. A factory pattern is used to mount the same APIs mapped to different back-ends as Flask paths. Back-ends are implemented as Flask extensions and modelled using an abstract base class, and resolved at run-time. This way, the API allows for successive implementations of future back-ends on the same API.
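
The run-time resolution of back-ends described above can be illustrated with a minimal sketch. It leaves out Flask entirely, and the names StorageBackend, list_volumes, BACKEND_CLASSES and create_backend are assumptions made for illustration; only DummyStorage is a class name taken from this README, and its real interface may differ.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical abstract base class; the project's real interface may differ."""

    @abstractmethod
    def list_volumes(self):
        """Return the names of all volumes known to the back-end."""


class DummyStorage(StorageBackend):
    """RAM-backed stand-in that keeps its volumes in a dictionary."""

    def __init__(self, **options):
        self.options = options
        self.volumes = {}

    def list_volumes(self):
        return sorted(self.volumes)


# Registry used to resolve a class name from the configuration at run-time.
BACKEND_CLASSES = {cls.__name__: cls for cls in (DummyStorage,)}


def create_backend(class_name, **options):
    """Instantiate the back-end named class_name with its keyword options."""
    try:
        backend_class = BACKEND_CLASSES[class_name]
    except KeyError:
        raise ValueError("unknown back-end class: {}".format(class_name)) from None
    return backend_class(**options)
```

Because callers only depend on the abstract base class, a new back-end would only need to subclass it and be added to the registry.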

Development

Development and testing (without proper authentication) can be done without a full deployment setup (see below). You need Python 3 and virtualenv (on Ubuntu python3-virtualenv) installed before running the commands below.

git clone https://github.com/cerndb/storage-api
cd storage-api
virtualenv --python=python3 v1
source v1/bin/activate
pip install -r requirements.txt

You can run the API in the Flask development server without SSO like this:

$ export FLASK_APP=app.py
$ export FLASK_DEBUG=1
$ flask run

Accessing /login in debug mode will immediately authorise you as an uber-admin.

There is also a shorthand Makefile option available as make devserver with corresponding make stop.

It is also possible to run the Dockerised image with make image run. Remember that you can determine which random port the container was assigned with docker port <name of container>.

Configuration

Configuration is done using environment variables, as per the 12-factor app pattern. The following environment variables are used:

  • SAPI_OAUTH_CLIENT_ID: The public client ID for the OAuth2 service the storage API runs as
  • SAPI_OAUTH_SECRET_KEY: The private key of the OAuth2 service the storage API runs as

The following keys are optional:

  • SAPI_OAUTH_TOKEN_URL: The URL to get OAuth2 Tokens from. Default: https://oauth.web.cern.ch/OAuth/Token
  • SAPI_OAUTH_AUTHORIZE_URL: The URL to send OAuth2 Authorization requests to. Default: https://oauth.web.cern.ch/OAuth/Authorize
  • SAPI_OAUTH_GROUPS_URL: The URL to send a GET request with the current OAuth2 token to receive a list of groups the current user is a member of. Default: https://oauthresource.web.cern.ch/api/Groups
  • SAPI_OAUTH_CALLBACK_URL_RELATIVE: The relative (to the app’s root) URL to report as the call-back URL to the OAuth2 Authorize server. You most likely do not need to change this, unless you have some really exotic routing pattern. Default: /oauth-done
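
As a rough illustration of how these variables could be read, the sketch below collects them into a dictionary and falls back to the defaults listed above. The function name load_oauth_config is hypothetical, and treating a missing client ID or secret key as a start-up error is an assumption, not documented behaviour.

```python
import os

# Variables without a documented default; assumed here to be mandatory.
REQUIRED_KEYS = ("SAPI_OAUTH_CLIENT_ID", "SAPI_OAUTH_SECRET_KEY")

# Optional variables and the defaults listed in this README.
OPTIONAL_DEFAULTS = {
    "SAPI_OAUTH_TOKEN_URL": "https://oauth.web.cern.ch/OAuth/Token",
    "SAPI_OAUTH_AUTHORIZE_URL": "https://oauth.web.cern.ch/OAuth/Authorize",
    "SAPI_OAUTH_GROUPS_URL": "https://oauthresource.web.cern.ch/api/Groups",
    "SAPI_OAUTH_CALLBACK_URL_RELATIVE": "/oauth-done",
}


def load_oauth_config(environ=None):
    """Return the OAuth2 settings as a dict, applying the documented defaults."""
    environ = os.environ if environ is None else environ
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    config = {key: environ[key] for key in REQUIRED_KEYS}
    for key, default in OPTIONAL_DEFAULTS.items():
        config[key] = environ.get(key, default)
    return config
```

Passing a plain dictionary instead of os.environ keeps the function easy to exercise in tests.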

The following configuration options control the role-based access control:

  • SAPI_ROLE_USER_GROUPS: A comma-separated list of groups whose users are just plain users. If the list is empty or unset, unauthenticated users are included in the user role and given access to certain (non-destructive) endpoints. See the API documentation!
  • SAPI_ROLE_ADMIN_GROUPS: A comma-separated list of groups whose users are (at least) administrators. Empty list or unset variable means the role is disabled.
  • SAPI_ROLE_UBER_ADMIN_GROUPS: A comma-separated list of groups whose users are uber-admins. Empty list or unset variable means the role is disabled.

Please note that roles are distinct, e.g. the admin role is not contained within the uber-admin role. If you want a user to have both roles, the user must be a member of a group in each list.
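
The role rules above can be summarised in a short sketch. The helper names parse_groups and roles_for and the group names used in the usage note are hypothetical. The sketch also assumes that an empty user list grants the user role to authenticated users as well as unauthenticated ones, which the text above only states for the unauthenticated case.

```python
ROLE_VARIABLES = {
    "user": "SAPI_ROLE_USER_GROUPS",
    "admin": "SAPI_ROLE_ADMIN_GROUPS",
    "uber-admin": "SAPI_ROLE_UBER_ADMIN_GROUPS",
}


def parse_groups(raw):
    """Split a comma-separated group list, ignoring blanks and whitespace."""
    return {group.strip() for group in (raw or "").split(",") if group.strip()}


def roles_for(user_groups, environ):
    """Return the set of roles for a user; pass None for an unauthenticated user."""
    memberships = set(user_groups or ())
    roles = set()
    for role, variable in ROLE_VARIABLES.items():
        allowed = parse_groups(environ.get(variable))
        if allowed & memberships:
            roles.add(role)
        elif role == "user" and not allowed:
            # Empty or unset user list: the user role is open to everyone.
            roles.add(role)
        # An empty admin or uber-admin list never matches, so the role is disabled.
    return roles
```

With SAPI_ROLE_USER_GROUPS set to "staff" and SAPI_ROLE_UBER_ADMIN_GROUPS set to "bosses", a member of only "bosses" receives the uber-admin role and nothing else, which reflects the roles being distinct.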

Back-ends are configured using the following pattern:

  • SAPI_BACKENDS: A unicorn emoji-separated (:unicorn:) list of back-ends to enable, each with its configuration in the following pattern: endpoint_name🌈BackEndClass🌈option_name🌈option_value🌈another_option_name🌈another_option_value. Here endpoint_name is the part that goes into the /<endpoint_name>/volumes part of the URL, BackEndClass is the name of a class that implements the corresponding back-end (e.g. DummyStorage), and the rainbow emoji-separated (:rainbow:) options that follow are passed as key-value arguments to that back-end's constructor.

    The following example sets up a NetApp back-end and RAM-backed dummy back-end: export SAPI_BACKENDS="dummy🌈DummyStorage🦄netapp🌈NetappStorage🌈username🌈storage-api🌈password🌈myPassword:@;🌈vserver🌈vs3sx50"

    Please note that it is perfectly possible to set up multiple endpoints with the same back-end, e.g. multiple NetApp filers or clusters with different vservers on different endpoints. Endpoint names need to be unique, though.

Without at least one configured endpoint, the app will not run.
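
To make the SAPI_BACKENDS format concrete, here is a minimal parsing sketch. The function name parse_backends and its error messages are hypothetical and not taken from the project's code; the separators and the rules on unique endpoint names and at least one endpoint follow the description above.

```python
BACKEND_SEPARATOR = "🦄"  # separates back-end entries
OPTION_SEPARATOR = "🌈"   # separates the fields within one entry


def parse_backends(raw):
    """Map each endpoint name to a (class name, options dict) tuple."""
    backends = {}
    for entry in filter(None, raw.split(BACKEND_SEPARATOR)):
        fields = entry.split(OPTION_SEPARATOR)
        # Endpoint name and class name, then option name/value pairs.
        if len(fields) % 2 != 0:
            raise ValueError("malformed back-end entry: {!r}".format(entry))
        endpoint, class_name, *rest = fields
        if endpoint in backends:
            raise ValueError("duplicate endpoint name: {!r}".format(endpoint))
        options = dict(zip(rest[0::2], rest[1::2]))
        backends[endpoint] = (class_name, options)
    if not backends:
        raise ValueError("at least one back-end must be configured")
    return backends
```

Using emoji as separators means that values such as the password in the example above can contain ordinary punctuation like ":" or ";" without any escaping.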

Testing and Continuous Integration

Continuous integration is provided by Travis, and tests are run using pytest and the Hypothesis framework. An exhaustive Hypothesis test run takes around 10-15 minutes, so it is advisable to use the much less exhaustive “dev” test profile locally and leave the “ci” profile for Travis. The profile can be provided as a command-line option for pytest: pytest --hypothesis-profile=dev. Logging verbosity during tests can be increased using -vvvv with a variable number of v's (more is noisier).

Deployment

The API is deployed via a standard Docker container to OpenShift. Provided you are authenticated with CERN’s GitLab as a registry in Docker, running make image push-image will build the image and push it to the registry. If you have also set OPENSHIFT_PUSH_TOKEN to an authentication token that is allowed to push images to the OpenShift project, make deploy-os will then deploy the new image to OpenShift as a rolling deployment. The entire process can be run with make image push-image deploy-os.

Documentation

Interactive API documentation is automatically generated by flask-restplus and served at the path /, with a JSON description available at /swagger.json. An HTML version is also rendered by Spectacle and published automatically by Travis to cerndb.github.io/storage-api.