Feature/address on local #256

Merged
merged 5 commits into from
Aug 16, 2022
8 changes: 7 additions & 1 deletion .gitignore
@@ -157,4 +157,10 @@ token.txt
 !/.github/*

 # data_store
-/data_store
+/data_store
+
+# Mac OS
+.DS_Store
+
+# Coverage
+cov.xml
4 changes: 3 additions & 1 deletion fair/cli.py
@@ -324,14 +324,16 @@ def install(debug: bool, force: bool, directory: str, version: str):
 @registry.command()
 @click.option("--debug/--no-debug", help="Run in debug mode", default=False)
 @click.option("--port", help="port on which to run registry", default=8000)
-def start(debug: bool, port: int) -> None:
+@click.option("--address", help="Address on which to run registry", default="127.0.0.1")
Contributor

I think this option is working, but I can only get the registry to start on 127.0.0.1.

Contributor

Windows 10, Python 3.10

It works with --address 127.0.0.1 but not with 192.168.1.101 (my laptop's address at the time), which gives Bad Request (400), so it is serving something! The click CLI part is clearly OK.

Collaborator Author

This is an issue with the allowed hosts in base_settings.py.

Either the address needs to be added to the allowed hosts, or all hosts need to be allowed, i.e. one of:
ALLOWED_HOSTS = ['*']
ALLOWED_HOSTS = ['127.0.0.1', 'localhost', '192.168.0.101']

This can be added to local-settings.py

Or we could import local settings into a new settings file and put it there.

Open to suggestions on where and how we add this in?

Collaborator

Hi, sorry for jumping in.

I would not allow all hosts.

I would edit the Django settings to look like this:

ALLOWED_HOSTS = []
ALLOWED_HOSTS.extend(
    filter(None, os.environ.get("ALLOWED_HOSTS", "").split(","))
)

and then pass the allowed hosts as an environment variable.

This would reduce the maintenance overhead in the Django app for future changes.

You could also create a system environment variable (prod/testing/staging) and have different Django settings for each.

Contributor

My intuition here is that this feature is going to be used infrequently, or by expert users who are either running in containers or deploying a "remote" registry. @bruvio's suggestion looks good (given my limited Django knowledge). Thanks! I agree that allowing all hosts by default is unlikely to be good practice.

I think if you go with the suggestion, it should be documented somewhere. I guess the other way might be to just document what manual changes are needed to these files to allow the host specified by the CLI.

Collaborator Author

> Hi, sorry for jumping in.
>
> I would not allow all hosts.
>
> I would edit the Django settings to look like this:
>
> ALLOWED_HOSTS = []
> ALLOWED_HOSTS.extend(
>     filter(None, os.environ.get("ALLOWED_HOSTS", "").split(","))
> )
>
> and then pass the allowed hosts as an environment variable.
>
> This would reduce the maintenance overhead in the Django app for future changes.
>
> You could also create a system environment variable (prod/testing/staging) and have different Django settings for each.

Thanks @bruvio

Agreed, this is a good solution. I would probably set the allowed hosts in the settings to ['127.0.0.1', 'localhost'] before extending the list with the environment variable, so that the registry still runs on both if the environment variable is not set.

I'll put in a PR on the registry, and we can add the address to the environment variable in the start registry function.
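
The settings-side idea discussed in this thread (keep the two local defaults, then extend them from an environment variable) might look roughly like the sketch below. It is not the registry's actual settings code: `build_allowed_hosts` and `DEFAULT_HOSTS` are hypothetical names, and the variable name `FAIR_ALLOWED_HOSTS` is assumed from the `fair/registry/server.py` change in this PR.

```python
import os
from typing import List, Mapping, Optional

# Hosts that should keep working even when the environment variable is unset.
DEFAULT_HOSTS = ["127.0.0.1", "localhost"]


def build_allowed_hosts(
    environ: Optional[Mapping[str, str]] = None,
    variable: str = "FAIR_ALLOWED_HOSTS",
) -> List[str]:
    """Return the default hosts extended with any hosts listed in `variable`.

    Hypothetical helper for illustration, not part of the registry.
    """
    environ = os.environ if environ is None else environ
    hosts = list(DEFAULT_HOSTS)
    for host in environ.get(variable, "").split(","):
        host = host.strip()
        # Skip empty entries (unset variable, trailing comma) and duplicates.
        if host and host not in hosts:
            hosts.append(host)
    return hosts


# In a Django settings module this could be used as:
ALLOWED_HOSTS = build_allowed_hosts()
```

Passing the environment in as a mapping keeps the helper easy to exercise without touching the real process environment, for example `build_allowed_hosts({"FAIR_ALLOWED_HOSTS": "192.168.1.101"})`.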

+def start(debug: bool, port: int, address: str) -> None:
     """Start the local registry server"""
     try:
         fdp_session.FAIR(
             os.getcwd(),
             server_mode=fdp_svr.SwitchMode.USER_START,
             debug=debug,
             server_port=port,
+            server_address=address
         )
     except fdp_exc.FAIRCLIException as e:
         if debug:
47 changes: 47 additions & 0 deletions fair/common.py
@@ -38,12 +38,15 @@
 import enum
 import os
 import pathlib
+import logging

 import git
 import yaml

 import fair.exceptions as fdp_exc

+_logger = logging.getLogger("FAIRDataPipeline.Common")
+
 USER_FAIR_DIR = os.path.join(pathlib.Path.home(), ".fair")
 FAIR_CLI_CONFIG = "cli-config.yaml"
 USER_CONFIG_FILE = "config.yaml"
@@ -145,6 +148,23 @@ def registry_session_port_file(registry_dir: str = None) -> str:
         registry_dir = registry_home()
     return os.path.join(registry_dir, "session_port.log")

+def registry_session_address_file(registry_dir: str = None) -> str:
+    """Retrieve the location of the registry session port file
+
+    Parameters
+    ----------
+    registry_dir : str, optional
+        registry directory, by default None
+
+    Returns
+    -------
+    str
+        path to registry port session file
+    """
+    if not registry_dir:
+        registry_dir = registry_home()
+    return os.path.join(registry_dir, "session_address.log")
+


 def registry_session_port(registry_dir: str = None) -> int:
     """Retrieve the registry session port
@@ -164,6 +184,33 @@ def registry_session_port(registry_dir: str = None) -> int:
     """
     return int(open(registry_session_port_file(registry_dir)).read().strip())

+def registry_session_address(registry_dir: str = None) -> str:
+    """Retrieve the registry session address
+
+    Unlike 'get_local_address' within the configuration module, this retrieves the
+    port number from the file generated by the registry itself
+
+    Parameters
+    ----------
+    registry_dir : str, optional
+        registry directory, by default None
+
+    Returns
+    -------
+    str
+        current/most recent address used to launch the registry
+    """
+    if not os.path.exists(registry_session_address_file(registry_dir)):
+        _logger.warning("Session Address file not found, please make sure your registry is up-to-date")
Contributor

I guess this could happen with an older registry version running.

+        _logger.info("Using 127.0.0.1")
+        return "127.0.0.1"
+
+    _address = open(registry_session_address_file(registry_dir)).read().strip()
+    if _address != "0.0.0.0":
+        return _address
+    else:
+        return "127.0.0.1"
+


 def staging_cache(user_loc: str) -> str:
     """Location of staging cache for the given repository"""
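
The fallback behaviour of `registry_session_address` can be tried in isolation with a sketch like the one below. It assumes nothing about the rest of `fair.common`: `read_session_address` and `LOOPBACK` are hypothetical names, and the file is created in a temporary directory rather than in the registry home.

```python
import logging
import pathlib
import tempfile

_logger = logging.getLogger("FAIRDataPipeline.Common")

LOOPBACK = "127.0.0.1"


def read_session_address(address_file: pathlib.Path) -> str:
    """Return the address stored in `address_file`, falling back to loopback.

    Hypothetical stand-alone variant of `registry_session_address`.
    """
    if not address_file.exists():
        _logger.warning("Session address file not found, using %s", LOOPBACK)
        return LOOPBACK

    # read_text opens and closes the file, unlike a bare open(...).read()
    address = address_file.read_text().strip()

    # 0.0.0.0 means "listen on all interfaces"; it is not an address a client
    # should connect to, so it is mapped to loopback as in the diff above.
    return LOOPBACK if address == "0.0.0.0" else address


with tempfile.TemporaryDirectory() as tmp:
    session_file = pathlib.Path(tmp) / "session_address.log"
    print(read_session_address(session_file))  # file does not exist yet
    session_file.write_text("0.0.0.0\n")
    print(read_session_address(session_file))
    session_file.write_text("192.168.1.101\n")
    print(read_session_address(session_file))
```

The only intentional difference from the code in the diff is the use of `pathlib.Path.read_text`, which avoids leaving the file handle open.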
12 changes: 2 additions & 10 deletions fair/configuration/__init__.py
@@ -438,18 +438,10 @@ def get_local_port(local_uri: str = None) -> int:

 def update_local_port() -> str:
     """Updates the local port in the global configuration from the session port file"""
-    # If the global configuration does not exist or is empty do nothing
-    if (
-        not os.path.exists(fdp_com.global_fdpconfig())
-        or not read_global_fdpconfig()
-    ):
-        _local_uri = fdp_com.DEFAULT_LOCAL_REGISTRY_URL
-    else:
-        _local_uri = get_local_uri()
-    _old_local_port = get_local_port()
     _current_port = fdp_com.registry_session_port()
+    _current_address = fdp_com.registry_session_address()

-    _new_url = _local_uri.replace(f":{_old_local_port}", f":{_current_port}")
+    _new_url = f'http://{_current_address}:{_current_port}/api/'

     if os.path.exists(fdp_com.global_fdpconfig()) and read_global_fdpconfig():
         _glob_conf = read_global_fdpconfig()
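
To make the effect of this hunk concrete: the URL is no longer derived from the configured URI by swapping the port, it is rebuilt from the session address and port. A minimal sketch of that composition follows; `build_local_uri` is a hypothetical helper, not a function in `fair.configuration`.

```python
from urllib.parse import urlsplit


def build_local_uri(address: str, port: int) -> str:
    """Compose the local registry API URL from an address and a port.

    Hypothetical helper mirroring the f-string used in `update_local_port`.
    """
    return f"http://{address}:{port}/api/"


uri = build_local_uri("192.168.1.101", 8001)
parts = urlsplit(uri)
print(uri)
print(parts.hostname, parts.port)
```

One consequence worth noting is that the scheme (`http`) and path (`/api/`) are now fixed by the f-string, whereas the removed `replace` call preserved whatever scheme and path the configured URI already had.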
56 changes: 31 additions & 25 deletions fair/identifiers.py
@@ -28,13 +28,11 @@
 ID_URIS = {
     "orcid": "https://orcid.org/",
     "ror": "https://ror.org/",
-    "grid": "https://www.grid.ac/institutes/",
 }

 QUERY_URLS = {
     "orcid": "https://pub.orcid.org/v2.0/",
     "ror": "https://api.ror.org/organizations?query=",
-    "grid": "https://www.grid.ac/institutes/",
 }


@@ -89,29 +87,15 @@ def check_ror(ror: str) -> typing.Dict:
         metadata from the given ID
     """

-    _url = f"{QUERY_URLS['ror']}{ror}"
-    _response = requests.get(_url)
-
-    _result_dict: typing.Dict[str, typing.Any] = {}
-
-    if _response.status_code != 200:
-        return _result_dict
-
-    if _response.json()["number_of_results"] == 0:
-        return _result_dict
-
-    _name = _response.json()["items"][0]["name"]
-    _result_dict["name"] = _name
-    _result_dict["family_name"] = _name
-    _result_dict["given_names"] = None
-    _result_dict["ror"] = ror
-    _result_dict["uri"] = f'{ID_URIS["ror"]}{ror}'
+    _result_dict = _check_generic_ror(ror)
+    if _result_dict:
+        _result_dict["ror"] = ror

     return _result_dict


 def check_grid(grid_id: str) -> typing.Dict:
-    """Checks if valid GRID ID using GRID public api
+    """Checks if valid GRID ID using ROR (https://ror.org/) public api
     Parameters
     ----------
     grid_id : str
@@ -121,21 +105,43 @@ def check_grid(grid_id: str) -> typing.Dict:
     typing.Dict
         metadata from the given ID
     """
-    _header = {"Accept": "application/json"}
-    _response = requests.get(f'{QUERY_URLS["grid"]}{grid_id}', headers=_header)
+    _result_dict = _check_generic_ror(f'"{grid_id}"')
+    if _result_dict:
+        _result_dict["grid"] = grid_id
+
+    return _result_dict
+
+def _check_generic_ror(id: str) -> typing.Dict:
+    """Checks if valid ROR using ROR public api
+
+    Parameters
+    ----------
+    ror : str
+        ROR to be checked
+
+    Returns
+    -------
+    typing.Dict
+        metadata from the given ID
+    """
+
+    _url = f"{QUERY_URLS['ror']}{id}"
+    _response = requests.get(_url)

     _result_dict: typing.Dict[str, typing.Any] = {}

     if _response.status_code != 200:
         return _result_dict

-    _name = _response.json()["institute"]["name"]
+    if _response.json()["number_of_results"] == 0:
+        return _result_dict

+    _id = _response.json()["items"][0]["id"]
+    _name = _response.json()["items"][0]["name"]
     _result_dict["name"] = _name
     _result_dict["family_name"] = _name
     _result_dict["given_names"] = None
-    _result_dict["grid"] = grid_id
-    _result_dict["uri"] = f'{ID_URIS["grid"]}{grid_id}'
+    _result_dict["uri"] = _id

     return _result_dict

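
The response handling in `_check_generic_ror` can be exercised without network access by separating the parsing from the request. The sketch below is one way to do that; `parse_ror_search` is a hypothetical helper, and the sample payload is hand-written in the shape the diff relies on (`number_of_results`, `items[0]["id"]`, `items[0]["name"]`), not a recorded API response.

```python
from typing import Any, Dict


def parse_ror_search(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the metadata fields used in the diff from a ROR search payload.

    Hypothetical helper; `payload` is the already decoded JSON body.
    """
    if payload.get("number_of_results", 0) == 0:
        return {}

    first = payload["items"][0]
    return {
        "name": first["name"],
        "family_name": first["name"],
        "given_names": None,
        "uri": first["id"],
    }


# Hand-written payload for illustration only.
sample = {
    "number_of_results": 1,
    "items": [{"id": "https://ror.org/049s0ch10", "name": "Rakon (France)"}],
}
print(parse_ror_search(sample))
print(parse_ror_search({"number_of_results": 0, "items": []}))
```

Working on a decoded dictionary also means the JSON body is parsed once, rather than on each of the three `_response.json()` calls in the diff.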
8 changes: 5 additions & 3 deletions fair/registry/server.py
@@ -95,7 +95,7 @@ def check_server_running(local_uri: str = None) -> bool:


 def launch_server(
-    port: int = 8000, registry_dir: str = None, verbose: bool = False
+    port: int = 8000, registry_dir: str = None, verbose: bool = False, address: str = "127.0.0.1"
 ) -> int:
     """Start the registry server.

@@ -122,7 +122,9 @@ def launch_server(
             " is the FAIR data pipeline properly installed on this system?"
         )

-    _cmd = [_server_start_script, "-p", f"{port}"]
+    _cmd = [_server_start_script, "-p", f"{port}", "-a", f"{address}"]
+
+    os.environ["FAIR_ALLOWED_HOSTS"] = address if not "FAIR_ALLOWED_HOSTS" in os.environ else os.environ["FAIR_ALLOWED_HOSTS"] + f",{address}"

     logger.debug("Launching server with command '%s'", " ".join(_cmd))
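
The one-line `FAIR_ALLOWED_HOSTS` update above appends the address on every call, so launching twice in the same process would list the same host twice. A possible alternative that skips duplicates is sketched below; `add_allowed_host` is a hypothetical helper and is not part of this PR.

```python
from typing import MutableMapping


def add_allowed_host(
    environ: MutableMapping[str, str],
    address: str,
    variable: str = "FAIR_ALLOWED_HOSTS",
) -> str:
    """Append `address` to the comma-separated host list stored in `environ`.

    Hypothetical helper; returns the updated value of `variable`.
    """
    # Split the current value and drop empty entries (unset or empty variable).
    hosts = [h for h in environ.get(variable, "").split(",") if h]
    if address not in hosts:
        hosts.append(address)
    environ[variable] = ",".join(hosts)
    return environ[variable]


env: MutableMapping[str, str] = {}
add_allowed_host(env, "127.0.0.1")
add_allowed_host(env, "192.168.1.101")
add_allowed_host(env, "192.168.1.101")  # repeated launch, no duplicate entry
print(env["FAIR_ALLOWED_HOSTS"])
```

In `launch_server` the mapping passed in would be `os.environ`; a plain dictionary is used here so the example does not modify the real environment.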

@@ -345,7 +347,7 @@ def install_registry(

     logger.debug("Using reference '%s' for registry checkout", reference)

-    _candidates = [t.name for t in _repo.tags + _repo.heads]
+    _candidates = [t.name for t in _repo.tags + _repo.heads + _repo.remote().refs]

     if reference not in _candidates:
         raise fdp_exc.RegistryError(
17 changes: 9 additions & 8 deletions fair/session.py
@@ -80,6 +80,7 @@ def __init__(
         debug: bool = False,
         server_mode: fdp_serv.SwitchMode = fdp_serv.SwitchMode.NO_SERVER,
         server_port: int = 8000,
+        server_address: str = "127.0.0.1",
         allow_dirty: bool = False,
         testing: bool = False,
         local: bool = False,
@@ -177,7 +178,7 @@ def __init__(

         self._load_configurations()

-        self._setup_server(server_port)
+        self._setup_server(server_port, server_address)

     def purge(
         self,
@@ -242,15 +243,15 @@ def purge(
         if os.path.exists(_global_dirs):
             shutil.rmtree(_global_dirs)

-    def _setup_server(self, port: int) -> None:
+    def _setup_server(self, port: int, address: str) -> None:
         """Start or stop the server if required"""
         self._logger.debug(
             f"Running server setup for run mode {self._run_mode}"
         )
         if self._run_mode == fdp_serv.SwitchMode.CLI:
-            self._setup_server_cli_mode(port)
+            self._setup_server_cli_mode(port, address)
         elif self._run_mode == fdp_serv.SwitchMode.USER_START:
-            self._setup_server_user_start(port)
+            self._setup_server_user_start(port, address)
         elif self._run_mode in [
             fdp_serv.SwitchMode.USER_STOP,
             fdp_serv.SwitchMode.FORCE_STOP,
@@ -278,7 +279,7 @@ def _stop_server(self) -> None:
             force=self._run_mode == fdp_serv.SwitchMode.FORCE_STOP,
         )

-    def _setup_server_cli_mode(self, port: int) -> None:
+    def _setup_server_cli_mode(self, port: int, address: str) -> None:
         self.check_is_repo()
         _cache_addr = os.path.join(
             fdp_com.session_cache_dir(), f"{self._session_id}.run"
@@ -288,7 +289,7 @@ def _setup_server_cli_mode(self, port: int) -> None:
         # If there are no session cache files start the server
         if not glob.glob(os.path.join(fdp_com.session_cache_dir(), "*.run")):
             self._logger.debug("No sessions found, launching server")
-            fdp_serv.launch_server(port=port)
+            fdp_serv.launch_server(port=port, address=address)

         self._logger.debug(f"Creating new session #{self._session_id}")

@@ -301,7 +302,7 @@ def _setup_server_cli_mode(self, port: int) -> None:
         # Create new session cache file
         pathlib.Path(_cache_addr).touch()

-    def _setup_server_user_start(self, port: int) -> None:
+    def _setup_server_user_start(self, port: int, address: str) -> None:
         if not os.path.exists(fdp_com.session_cache_dir()):
             os.makedirs(fdp_com.session_cache_dir())

@@ -319,7 +320,7 @@ def _setup_server_user_start(self, port: int) -> None:
             )
         click.echo("Starting local registry server")
         pathlib.Path(_cache_addr).touch()
-        fdp_serv.launch_server(port=port, verbose=True)
+        fdp_serv.launch_server(port=port, verbose=True, address=address)

     def _pre_job_setup(self, remote: str = None) -> None:
         self._logger.debug("Running pre-job setup")
12 changes: 10 additions & 2 deletions tests/test_identifiers.py
@@ -12,20 +12,28 @@ def test_check_orcid():
     assert _data["orcid"] == "0000-0002-6773-1049"
     assert not fdp_id.check_orcid("notanid!")

+@pytest.mark.faircli_ids
+def test_check_generic_ror():
+    _data = fdp_id._check_generic_ror("049s0ch10")
+    assert _data["name"] == "Rakon (France)" == _data["family_name"]
+    assert not "ror" in _data
+    assert not "grid" in _data
+    assert not fdp_id.check_ror("notanid!")

 @pytest.mark.faircli_ids
 def test_check_ror():
     _data = fdp_id.check_ror("049s0ch10")
     assert _data["name"] == "Rakon (France)" == _data["family_name"]
     assert _data["ror"] == "049s0ch10"
-    assert not fdp_id.check_grid("notanid!")
-
+    assert _data['uri'] == "https://ror.org/049s0ch10"
+    assert not fdp_id.check_ror("notanid!")

 @pytest.mark.faircli_ids
 def test_check_grid():
     _data = fdp_id.check_grid("grid.438622.9")
     assert _data["name"] == "Rakon (France)" == _data["family_name"]
     assert _data["grid"] == "grid.438622.9"
+    assert _data['uri'] == "https://ror.org/049s0ch10"
     assert not fdp_id.check_grid("notanid!")