Modules

Retrieve sensor, feature, and capabilities metadata from a Sensor Observation Service (SOS) using pytcup. Pytcup connects to the SOS and returns metadata details for sensors and features.
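The pytcup calls for these lookups are not listed here, so the sketch below does not use pytcup. It only illustrates the kind of metadata requests an SOS answers, using the key-value-pair (KVP) GET binding from the OGC SOS 2.0 standard. The endpoint URL, the sensor identifier, and the helper name sos_kvp_url are hypothetical, and whether the TCUP SOS exposes this binding is an assumption.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sos_kvp_url(base_url, request, version="2.0.0", **params):
    """Build an OGC SOS KVP GET URL for a metadata request.

    Hypothetical helper for illustration; it is not part of pytcup.
    """
    query = {"service": "SOS", "request": request}
    if request == "GetCapabilities":
        # GetCapabilities negotiates the version through AcceptVersions.
        query["AcceptVersions"] = version
    else:
        query["version"] = version
    query.update(params)
    return f"{base_url}?{urlencode(query)}"


# Hypothetical endpoint and sensor identifier (assumptions, not TCUP values).
BASE = "https://sos.example.org/service"

# Capabilities metadata: what the service offers.
capabilities_url = sos_kvp_url(BASE, "GetCapabilities")

# Sensor metadata: description of one procedure (sensor).
sensor_url = sos_kvp_url(
    BASE,
    "DescribeSensor",
    procedure="sensor-001",
    procedureDescriptionFormat="http://www.opengis.net/sensorml/1.0.1",
)

# Feature metadata: the features of interest known to the service.
features_url = sos_kvp_url(BASE, "GetFeatureOfInterest")

for url in (capabilities_url, sensor_url, features_url):
    print(url)
```

The helper only builds URLs and sends nothing over the network, so it can be run without access to an SOS instance.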

context

hdfsread

pytask

datalake

Pytcup datalake is a Python 3 library that provides a Python interface to the TCUP Data Lake Service and connectivity with the TCUP Spark cluster for distributed computing. The Data Lake Service (DLS) is a set of REST endpoints for uploading files to, and downloading files from, the TCUP distributed file system. These files can be accessed and loaded into TCUP Spark using the Pytcup datalake library.

ETL

The Pytcup ETL Python package extracts, transforms, and loads data from the TCUP big data store. PyETL can also extract data from external sources, e.g. AWS S3, Azure Blob, or an external Postgres DB. Data integration and transformation are important requirements for data analysis on TCUP big data. Pytcup ETL can be used for generic data transformation and for data preprocessing for machine learning. Data can be loaded into distributed Spark and the ETL operations performed there. Alternatively, the data can be downloaded and processed in a non-distributed environment such as Pandas, TensorFlow, or scikit-learn. The same features are available through the Task service when code developed in the Notebook is deployed to it.
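As a rough illustration of the non-distributed path, the sketch below runs an extract, transform, load sequence using only the Python standard library. It does not use the Pytcup ETL API, which is not shown here. The csv and sqlite3 modules stand in for a downloaded file and a target store, and the column names and sample rows are made up for the example.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for a file downloaded from the data store.
RAW_CSV = """sensor_id,timestamp,temp_f
s1,2024-01-01T00:00:00,68.0
s1,2024-01-01T01:00:00,
s2,2024-01-01T00:00:00,50.0
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing reading, convert Fahrenheit to Celsius."""
    cleaned = []
    for row in rows:
        if not row["temp_f"]:
            continue  # skip readings with no temperature value
        temp_c = (float(row["temp_f"]) - 32.0) * 5.0 / 9.0
        cleaned.append((row["sensor_id"], row["timestamp"], round(temp_c, 2)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(sensor_id TEXT, timestamp TEXT, temp_c REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
records = transform(extract(RAW_CSV))
load(records, conn)
row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(f"loaded {row_count} rows")
```

Keeping the three stages as separate functions means the transform step could in principle be reused when the same logic moves from a notebook to a deployed task; how that deployment works in TCUP is not covered by this sketch.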

FE

GPU