# DataCatalog: _tigapics_

In [None]:
import requests
import pandas as pd

import src.utils as ut

# Set up the root path of the application
project_path = ut.project_path()

# Load the metadata

meta_filename = [
    f"{ut.project_path(1)}/meta/mosquito_alert/tigapics.json",
    f"{ut.project_path(2)}/meta_ipynb/tigapics.html",
]
metadata = ut.load_metadata(meta_filename)

# Get contentUrl from metadata file
ut.info_meta(metadata)

## Dataset: _tigapics_mosquitoalert_

### 1. Distribution by image download from MosquitoAlert webserver

This distribution allows downloading individual pictures (adults and sites)
that can be viewed on the Mosquito Alert map webserver, given a picture ID.

In [None]:
# Get metadata
contentUrl, dataset_name, distr_name = ut.get_meta(
    metadata, idx_distribution=0, idx_hasPart=0
)

# Make folders for data download
path = f"{project_path}/data/{dataset_name}/{distr_name}"
ut.makedirs(path)

To get a picture from the [Mosquito Alert public map](http://webserver.mosquitoalert.com/static/tigapublic/spain.html#/en/),
we need to know its file name (ID hash plus file extension). The simplest way
to find one is to browse the map visually, as in the example below:

In [None]:
# Set up a picture ID file name and get the corresponding URL address
ID_PICNAME = "a67c2ad2-09b6-4dbe-9cd0-2536a91e17f3.jpg"
contentUrl_pic = contentUrl.format(ID_PICNAME=ID_PICNAME)
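
The `format` call above relies on `contentUrl` being a template string with an `{ID_PICNAME}` placeholder. The following is a minimal sketch of that substitution. The value of `example_template` is an assumption modelled on the picture address used in this notebook; the actual template is whatever `ut.get_meta` returns from the metadata file.

```python
# Assumed template; the real one is read from the metadata via ut.get_meta().
example_template = "http://webserver.mosquitoalert.com/media/tigapics/{ID_PICNAME}"

pic_name = "a67c2ad2-09b6-4dbe-9cd0-2536a91e17f3.jpg"

# str.format replaces the named placeholder with the picture file name.
pic_url = example_template.format(ID_PICNAME=pic_name)
print(pic_url)
```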


Note that this particular ID corresponds to the adult mosquito picture displayed
below, which is available on the [Mosquito Alert public map](http://webserver.mosquitoalert.com/static/tigapublic/spain.html#/en/).

<p align="center">
  <img src="http://webserver.mosquitoalert.com/media/tigapics/a67c2ad2-09b6-4dbe-9cd0-2536a91e17f3.jpg" alt="Aedes albopictus" height="400"/>
</p>

Another way to get a picture ID is to download the _reports_ dataset,
where the measured variable _movelab_annotation_ is a dictionary whose key
_photo_html_ gives the photo _href_ (ID) for a given report.
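
As a rough sketch of that second route, the snippet below pulls the file name out of a _photo_html_ value with the standard-library HTML parser. The `report` record is hypothetical and only illustrates the shape described above; `HrefCollector` and `picture_names` are helper names introduced here, and the real reports may store the link differently.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def picture_names(photo_html):
    """Return the file names referenced by the links in an HTML snippet."""
    parser = HrefCollector()
    parser.feed(photo_html)
    return [basename(urlparse(href).path) for href in parser.hrefs]


# Hypothetical record; the real structure of a report may differ.
report = {
    "movelab_annotation": {
        "photo_html": '<a href="/media/tigapics/a67c2ad2-09b6-4dbe-9cd0-2536a91e17f3.jpg">photo</a>'
    }
}
names = picture_names(report["movelab_annotation"]["photo_html"])
print(names)
```

Each returned name can then be used as `ID_PICNAME` when filling the URL template.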

In [None]:
# Download the picture and save it
r = requests.get(contentUrl_pic)
r.raise_for_status()  # stop on HTTP errors instead of saving an error page
with open(f"{path}/{ID_PICNAME}", "wb") as f:
    f.write(r.content)
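
A server may answer with an HTML page instead of a picture, for instance when the ID is wrong, and the bytes would then be written to disk under a `.jpg` name. One possible safeguard, sketched here with a hypothetical helper `detect_image_type`, is to inspect the leading bytes of the payload before saving it. The two byte strings below are stand-ins for `r.content`, not real downloads.

```python
def detect_image_type(data):
    """Return 'jpeg', 'png', or None based on the leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    return None


# Stand-ins for r.content: a JPEG-like header and an HTML error page.
jpeg_like = b"\xff\xd8\xff\xe0" + b"\x00" * 16
error_page = b"<html><body>404 Not Found</body></html>"

print(detect_image_type(jpeg_like))
print(detect_image_type(error_page))
```

In the notebook, the file would only be written when `detect_image_type(r.content)` is not `None`.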

## Dataset: _tigapics_labels_bioimagearchive_

### 1. Distribution by label-file download from BioStudies repository

This dataset distribution allows downloading the labels that describe the
_tigapics_bioimagearchive_ dataset.

In [None]:
# Get metadata
contentUrl, dataset_name, distr_name = ut.get_meta(
    metadata, idx_distribution=0, idx_hasPart=1
)

# Make folders for data download
path = f"{project_path}/data/{dataset_name}/{distr_name}"
ut.makedirs(path)

In [None]:
# Download the file list of labels and save it
r = requests.get(contentUrl)
r.raise_for_status()  # stop on HTTP errors instead of saving an error page
with open(f"{path}/labels.tsv", "wb") as f:
    f.write(r.content)

# Get the labels into a dataframe
df_labels = pd.read_csv(f"{path}/labels.tsv", sep="\t")
df_labels.head()
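
Since each entry of the `Files` column has the form `CLASS/FILENAME`, the label file can also give a quick picture count per class. The sketch below uses only the standard library and a made-up three-row excerpt, so it runs without the downloaded file. The file names in `sample_tsv` are hypothetical, and the real file may contain more columns.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of labels.tsv; only the "Files" column name
# is taken from this notebook.
sample_tsv = (
    "Files\n"
    "Aedes_albopictus/pic_001.jpg\n"
    "Aedes_albopictus/pic_002.jpg\n"
    "Culex/pic_003.jpg\n"
)

rows = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t")

# The class is the part of each entry before the first slash.
class_counts = Counter(row["Files"].split("/", 1)[0] for row in rows)
print(class_counts.most_common())
```

With the dataframe already loaded, `df_labels["Files"].str.split("/").str[0].value_counts()` should give the same counts.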

## Dataset: _tigapics_bioimagearchive_

### 1. Distribution by single image download from the BioImage Archive repository

This distribution allows downloading individual pictures of adult mosquitoes,
useful for machine-learning classification tasks.

In [None]:
# Get metadata
contentUrl, dataset_name, distr_name = ut.get_meta(
    metadata, idx_distribution=0, idx_hasPart=2
)

# Make folders for data download
path = f"{project_path}/data/{dataset_name}/{distr_name}"
ut.makedirs(path)

To get a picture from the [Mosquito Alert BioImage Archive](https://www.ebi.ac.uk/biostudies/studies/S-BIAD249/),
we need to know its class (species) label and its file-name ID. This information
is provided by the _tigapics_labels_bioimagearchive_ dataset. As an example, we
simply take the first entry of this label dataset.

In [None]:
# Set up the class and picture ID, and get the corresponding URL address
CLASS, ID_PICNAME = df_labels.iloc[0]["Files"].split("/")
contentUrl_pic = contentUrl.format(CLASS=CLASS, ID_PICNAME=ID_PICNAME)
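
The tuple unpacking above only works when an entry contains exactly one slash. When looping over many label entries, a small checking function may be preferable. `split_label_entry` is a hypothetical helper name and the sample entries are invented for illustration.

```python
def split_label_entry(entry):
    """Split 'CLASS/FILENAME' into its two parts, rejecting anything else."""
    parts = entry.strip().split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'CLASS/FILENAME', got {entry!r}")
    return parts[0], parts[1]


# Hypothetical entries, not taken from the real label file.
print(split_label_entry("Culex/pic_003.jpg"))
for bad in ("pic_003.jpg", "a/b/c.jpg", "Culex/"):
    try:
        split_label_entry(bad)
    except ValueError as err:
        print("skipped:", err)
```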

In [None]:
# Download the picture and save it
r = requests.get(contentUrl_pic)
r.raise_for_status()  # stop on HTTP errors instead of saving an error page
with open(f"{path}/{ID_PICNAME}", "wb") as f:
    f.write(r.content)

### 2. Distribution by image-chunks download from BioStudies repository

This distribution allows downloading pictures of adult mosquitoes in chunks,
one per class; the classes may correspond to species.

In [None]:
# Get metadata
contentUrl, dataset_name, distr_name = ut.get_meta(
    metadata, idx_distribution=1, idx_hasPart=2
)

# Make folders for data download
path = f"{project_path}/data/{dataset_name}/{distr_name}"
ut.makedirs(path)

To get pictures from the [Mosquito Alert BioImage Archive](https://www.ebi.ac.uk/biostudies/studies/S-BIAD249/)
we need to know which classes are available. One may choose among the following
classes: _Aedes_albopictus_, _Aedes_aegypti_, _Aedes_japonicus_, _Aedes_koreicus_,
_Japonicus_koreicus_, _Complex_, _Culex_, _Other_species_ and _Not_sure_.

In [None]:
# Set up the class to download
CLASS = "Aedes_japonicus"
contentUrl_pic = contentUrl.format(CLASS=CLASS)
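
A misspelled class name would still produce a syntactically valid URL, so it may help to check the name against the list given above before formatting. In this sketch, `class_url` is a hypothetical helper and `demo_template` is a placeholder address used only for illustration; the real template comes from the metadata.

```python
# Class names as listed in the text above.
KNOWN_CLASSES = (
    "Aedes_albopictus", "Aedes_aegypti", "Aedes_japonicus", "Aedes_koreicus",
    "Japonicus_koreicus", "Complex", "Culex", "Other_species", "Not_sure",
)


def class_url(template, class_name):
    """Fill the {CLASS} placeholder after checking the class name."""
    if class_name not in KNOWN_CLASSES:
        raise ValueError(f"unknown class {class_name!r}; choose from {KNOWN_CLASSES}")
    return template.format(CLASS=class_name)


# Placeholder template for illustration only, not the real archive address.
demo_template = "https://example.org/archive/{CLASS}.zip"
print(class_url(demo_template, "Aedes_japonicus"))
```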

In [None]:
# Download the zip archive of the chosen class and save it
r = requests.get(contentUrl_pic)
r.raise_for_status()  # stop on HTTP errors instead of saving an error page
with open(f"{path}/{CLASS}.zip", "wb") as f:
    f.write(r.content)
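
Once the archive is on disk, the pictures still have to be unpacked. The following sketch uses the standard `zipfile` module. `extract_archive` is a hypothetical helper, and the demonstration builds a tiny archive on the fly because the internal layout of the real class archives is not described here.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract a zip archive and return the names of the regular files it contains."""
    with zipfile.ZipFile(zip_path) as archive:
        bad_member = archive.testzip()  # None when every CRC check passes
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        archive.extractall(target_dir)
        return [info.filename for info in archive.infolist() if not info.is_dir()]


# Self-contained demonstration with a tiny archive built on the fly.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "Aedes_japonicus.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("Aedes_japonicus/pic_001.jpg", b"\xff\xd8\xff")
    extracted = extract_archive(demo_zip, Path(tmp) / "unzipped")
    file_exists = (Path(tmp) / "unzipped" / "Aedes_japonicus" / "pic_001.jpg").is_file()

print(extracted, file_exists)
```

In the notebook, the call would be along the lines of `extract_archive(f"{path}/{CLASS}.zip", path)`.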