
HS RDF HydroShare Python Client Resource File Operation Examples


The following code snippets show how to use the HS RDF HydroShare Python Client to manipulate files within a HydroShare Resource.

Install the HS RDF Python Client

The HS RDF Python Client for HydroShare is not installed by default, so you must install it before you can work with it. Use the following command to install the Python Client (the hsclient package) with pip from the Python Package Index (PyPI).

!pip install hsclient

Authenticating with HydroShare

Before you start interacting with resources in HydroShare you will need to authenticate.

from hsclient import HydroShare

hs = HydroShare()
hs.sign_in()

Create a New Empty Resource

A "resource" is a container for your content in HydroShare. Think of it as a "working directory" into which you are going to organize the code and/or data you are using and want to share. The following code can be used to create a new, empty resource within which you can create content and metadata.

This code creates a new resource in HydroShare. It also creates an in-memory object representation of that resource in your local environment that you can then manipulate with further code.

# Create the new, empty resource
new_resource = hs.create()

# Get the HydroShare identifier for the new resource
resIdentifier = new_resource.resource_id
print('The HydroShare Identifier for your new resource is: ' + resIdentifier)

# Construct a hyperlink for the new resource
print('Your new resource is available at: ' +  new_resource.metadata.url)

Resource File Handling

HydroShare resources can hold any number of files organized within a file/directory structure. File handling operations allow you to manage the content files within a resource.

First, show the list of files within the resource, which is initially empty. The search_aggregations argument tells the client whether you want to look at all of the files in the resource (search_aggregations=True) or only at files that do not belong to a content aggregation (search_aggregations=False).

# Print the title of the resource and the list of files it contains
print('Working on: ' + new_resource.metadata.title)
print('File list:')
for file in new_resource.files(search_aggregations=True): 
  print(file.name)

Adding Files to a Resource

You may need to add content files to your resource. The examples here upload files from the Example_Files folder that is included with the HydroShare resource that contains these Jupyter Notebook examples. If you are running in your own local Python environment and want to load files from your local machine, specify the path to the file(s) on your hard drive. To upload multiple files at once, pass multiple file paths separated by commas to the file_upload() function.

Note that if you upload files that already exist, those files will be overwritten.

# Upload one or more files to your resource 
new_resource.file_upload('Example_Files/Data_File1.csv', 'Example_Files/Data_File2.csv')

# Print the names of the files in the resource
print('Updated file list after adding a file: ')
for file in new_resource.files(search_aggregations=True): 
  print(file.path)
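
When a folder holds many files, typing each path by hand gets tedious. The following is a minimal sketch, using only the Python standard library, of one way to gather the paths first. The helper name collect_paths is hypothetical and is not part of hsclient. The resulting list could then be unpacked into the file_upload() call shown above.

```python
from pathlib import Path


def collect_paths(folder, pattern="*.csv"):
    """Return sorted string paths of files in `folder` matching `pattern`.

    Hypothetical helper for illustration; it is not part of hsclient.
    """
    return sorted(str(p) for p in Path(folder).glob(pattern) if p.is_file())


# Gather every CSV file in the Example_Files folder used in this notebook
paths = collect_paths("Example_Files")
print(paths)

# The list could then be unpacked into the upload call shown above:
# new_resource.file_upload(*paths)
```

Sorting the paths keeps the upload order predictable between runs.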

HydroShare also allows you to create a folder hierarchy within your resource. You can use this functionality to keep your content organized, just as you would on your own computer. You can upload files to specific folders within the resource. Paths to folders are specified relative to the "content" directory of the resource.

# First create a new folder
new_resource.folder_create('New_Folder')

# Upload one or more files to a specific folder within a resource
new_resource.file_upload('Example_Files/Data_File2.csv', destination_path='New_Folder')

# Print the names of the files in the resource
print('Updated file list after adding a file: ')
for file in new_resource.files(search_aggregations=True): 
  print(file.path)

Searching for Files within a Resource

If you need to find one or more files within a resource so that you can download or remove them, several filters are available that return either a list of files that meet your search criteria or a single file.

Get a List of Files

Execute a filter to return a list of files within the resource that meet the search criteria.

# Get a list of all of the files in the resource that are not part of an aggregation
file_list = new_resource.files()
print('All files that are not part of an aggregation:')
print(*file_list, sep='\n')
print('\n')

# Get a list of all of the files in the resource inclusive of files that are inside 
# content type aggregations
file_list = new_resource.files(search_aggregations=True)
print('All files in the resource:')
print(*file_list, sep='\n')
print('\n')

# Get a list of all of the files within a folder in the resource
# Note that you have to pass the full relative path to the folder you are searching
# because there may be multiple folders within a resource with the same name.
# To get files in the root folder, pass an empty string (folder="")
file_list = new_resource.files(folder="New_Folder")
print('All files within a specific folder:')
print(*file_list, sep='\n')
print('\n')

# Get a list of all files that have a specific extension. This searches all folders
file_list = new_resource.files(extension=".csv")
print('All files with a .csv file extension:')
print(*file_list, sep='\n')
print('\n')

# Filters can be combined
# Get a list of all files in a particular folder that have a specific extension
file_list = new_resource.files(folder="New_Folder", extension=".csv")
print('All files with a .csv file extension in a particular folder:')
print(*file_list, sep='\n')
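
To make the combined-filter behavior concrete, here is a small local sketch that applies the same two criteria to a plain list of relative paths. It is an illustrative stand-in only, not how hsclient is implemented. The function name filter_paths and the notes.txt entry are hypothetical, and the sketch assumes that a folder filter matches the full relative folder path exactly, as the note above describes.

```python
from pathlib import PurePosixPath


def filter_paths(paths, folder=None, extension=None):
    """Keep the paths that satisfy every filter that was supplied.

    Illustrative stand-in only; this is not hsclient's implementation.
    """
    matches = []
    for path in paths:
        p = PurePosixPath(path)
        # A file in the root has parent ".", so map that to "" for the root
        parent = "" if str(p.parent) == "." else str(p.parent)
        if folder is not None and parent != folder:
            continue
        if extension is not None and p.suffix != extension:
            continue
        matches.append(path)
    return matches


# Hypothetical file list resembling the resource built in this notebook
paths = [
    "Data_File1.csv",
    "Data_File2.csv",
    "New_Folder/Data_File2.csv",
    "New_Folder/notes.txt",
]
print(filter_paths(paths, folder="New_Folder", extension=".csv"))
```

Each filter narrows the result further, so a file is returned only when it passes all of the filters that were given.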

Search for a Single File

Execute a filter to look for a single file in the resource that meets the search criteria.

# Get a single file using its path relative to the resource content directory
file = new_resource.file(path="New_Folder/Data_File2.csv")
print('File retrieved using path:')
print(file)
print('\n')

# Get a single file using its name
# Note that if you have multiple files in your resource with the same name, but in different
# folders, you should search for a particular file using the path parameter to ensure that
# you get the right file
file = new_resource.file(name="Data_File2.csv")
print('File retrieved using name:')
print(file)
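
Before searching by name, it can help to check whether any file name occurs in more than one folder. The sketch below works on a plain list of relative paths, which could be built from the path property of the file objects shown elsewhere in this notebook. The helper find_duplicate_names is hypothetical and not part of hsclient.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_duplicate_names(paths):
    """Map each file name found in more than one location to its paths."""
    by_name = defaultdict(list)
    for path in paths:
        by_name[PurePosixPath(path).name].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}


# Relative paths like those in the resource at this point in the notebook
paths = ["Data_File1.csv", "Data_File2.csv", "New_Folder/Data_File2.csv"]
print(find_duplicate_names(paths))
```

Any name that appears in the result is ambiguous, so search for those files with the path parameter instead.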

Get the Properties of a File

When you use the filters to return a file from a resource, you get back a file object that holds properties of the file.

# Search for a file within a resource
file = new_resource.file(path="New_Folder/Data_File2.csv")

# Print the properties of the file
print('File name: ' + file.name)
print('File extension: ' + file.extension)
print('File folder name: ' + file.folder)
print('File path: ' + file.path)
print('File url_path:  ' + file.url)
#print('File checksum:' + file.checksum)

# TODO: The checksum property is not implemented yet

Renaming and Moving Files

You may need to rename or move files once they have been added to a resource. First get the file object and then rename or move it.

# Get a file to rename - use the relative path to the file to make sure you have the right one
file = new_resource.file(path="Data_File2.csv")

# Rename the file to whatever you want
new_resource.file_rename(file, 'Data_File2_Renamed.csv')

# Print the names of the files in the resource
print('Updated file list after renaming a file: ')
for file in new_resource.files(search_aggregations=True): 
  print(file.path)

Moving files is similar to renaming. Instead of just changing the file name, change the relative path of the file to move it to the new location within the resource.

# Get a file to move
file = new_resource.file(path="Data_File1.csv")

# Move the file to a different folder
new_resource.file_rename(file, 'New_Folder/Data_File1.csv')

# Print the names of the files in the resource
print('Updated file list after moving a file: ')
for file in new_resource.files(search_aggregations=True): 
  print(file.path)
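
Because a move is expressed as a new relative path, the target path can be built from the file's current path and the destination folder. The following is a minimal sketch of that path arithmetic using only the standard library. The helper moved_path is hypothetical, and the sketch assumes that an empty folder name means the root of the resource content directory.

```python
from pathlib import PurePosixPath


def moved_path(current_path, target_folder):
    """Return the relative path a file would have inside `target_folder`.

    Hypothetical helper; an empty `target_folder` means the content root.
    """
    name = PurePosixPath(current_path).name
    return str(PurePosixPath(target_folder) / name) if target_folder else name


# Move into a folder: keeps the file name, changes the folder
print(moved_path("Data_File1.csv", "New_Folder"))

# Move back to the root of the content directory
print(moved_path("New_Folder/Data_File1.csv", ""))
```

The returned string is the kind of value passed as the second argument to file_rename() in the example above.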

Downloading Files from a Resource

You can download individual files from an existing HydroShare resource. You can use the filters shown above to specify which file(s) you want to download.

When you call the file_download() function for an individual file, you can pass the path where you want to save the file as a string. Leaving the path blank downloads the file to the same directory as your Jupyter Notebook.

# Download a single file from a resource
# Note that if you have multiple files within the same resource that have the same name,
# and you want a particular file, you need to specify the relative path to the specific file
file = new_resource.file(path='New_Folder/Data_File1.csv')
new_resource.file_download(file)

If you want to, you can clean up the file that was just downloaded by deleting it using a terminal command.

!rm 'Data_File1.csv'
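
The rm command is not available in every environment (for example, a default Windows shell). As an alternative, here is a minimal sketch that removes the same file with the Python standard library. It assumes the file was saved next to the notebook under the name Data_File1.csv, as in the download step above.

```python
from pathlib import Path

# Name of the file downloaded in the previous step
downloaded = Path("Data_File1.csv")

# missing_ok=True (Python 3.8+) avoids an error if the file is not there
downloaded.unlink(missing_ok=True)

# Confirm that the file is gone
print(downloaded.exists())
```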

Removing Files from a Resource

You can also delete files from a resource. This example removes one of the files added to the resource above. Each file has to be deleted individually. Retrieve the file using the path parameter to make sure you are deleting the right one.

# Specify the file you want to delete
file = new_resource.file(path="New_Folder/Data_File2.csv")

new_resource.file_delete(file)

# Print the names of the files in the resource
print("Updated file list after removing file: ")
for file in new_resource.files(search_aggregations=True): 
  print(file.path)

TODO: The following items are being worked on

  • Delete a folder and all of the files within it.
  • Moving a folder.
  • Zip a file or a folder.
  • Rename a folder.
  • Download a folder as a zipped file.