Workflow with Artifacts
This is a basic workflow for handling artifacts in TableVault. To understand the basics of builders and artifacts, please read Core Concepts: Artifacts and the Builders Guide.
1. Creating an Artifact Table and Instance
You can continue from the TableVault repository generated from the Basic Workflow.
tablevault = TableVault(db_dir = "test_tv", author = "dixie")
tablevault.create_table(table_name = "fruit_images", allow_multiple_artifacts = False)
tablevault.create_instance(table_name = "fruit_images")
Setting allow_multiple_artifacts to False tells the system that there will be only one artifact repository for the entire folder.
2. A Code Function that Generates Artifacts
You can then populate the code file to import an image, given a type of fruit:
import shutil

def fetch_image_from_string(fruit: str, artifact_dir: str):
    file_path = f'./all_images/{fruit}.png'  # pre-existing file
    new_file_path = f'{artifact_dir}/{fruit}.png'
    shutil.copy(file_path, new_file_path)  # copy the image into the artifact folder
    return f'{fruit}.png'  # return relative path
If you don't have direct access to a text editor on your platform, you can pass the code as a string through the text argument of create_code_module().
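As a minimal sketch of the string-based route, the snippet below keeps the function source in a Python string and checks that it parses before it is handed to TableVault. The names CODE_TEXT and defined_functions are hypothetical helpers for illustration, and the commented-out call at the end only assumes that text is accepted as a keyword; the other arguments of create_code_module() are not covered in this guide.

```python
import ast

# The same function as above, stored as a plain string.
CODE_TEXT = """
import shutil

def fetch_image_from_string(fruit: str, artifact_dir: str):
    file_path = f'./all_images/{fruit}.png'  # pre-existing file
    new_file_path = f'{artifact_dir}/{fruit}.png'
    shutil.copy(file_path, new_file_path)
    return f'{fruit}.png'  # return relative path
"""


def defined_functions(source: str) -> list:
    """Parse the source and list the top-level function names it defines.

    ast.parse raises SyntaxError if the string is not valid Python, so this
    doubles as a quick sanity check before submitting the code.
    """
    tree = ast.parse(source)
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


if __name__ == "__main__":
    print(defined_functions(CODE_TEXT))
    # Assumption: the string is then passed through the `text` argument, e.g.
    # tablevault.create_code_module(..., text=CODE_TEXT)
```

Validating the string locally first is a design choice rather than a TableVault requirement: it surfaces syntax errors before the module is stored in the repository.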
Executing the Example
For your code to execute, an actual image must exist at the file_path location.
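If no sample image is at hand, a placeholder can be generated with the standard library alone. The sketch below is an assumption-laden illustration rather than part of TableVault: write_placeholder_png and run_demo are hypothetical helpers, the fruit name "apple" is arbitrary, and the artifact folder is a temporary directory standing in for the one TableVault would supply.

```python
import os
import shutil
import struct
import tempfile
import zlib


def png_chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, tag, data, and CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def write_placeholder_png(path: str) -> None:
    """Write a valid 1x1 red RGB PNG to `path` (hypothetical helper)."""
    signature = b"\x89PNG\r\n\x1a\n"
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)  # 1x1, 8-bit RGB
    raw = b"\x00" + b"\xff\x00\x00"  # filter byte + one red pixel
    body = png_chunk(b"IHDR", ihdr) + png_chunk(b"IDAT", zlib.compress(raw))
    with open(path, "wb") as f:
        f.write(signature + body + png_chunk(b"IEND", b""))


def fetch_image_from_string(fruit: str, artifact_dir: str):
    """Same function as in the guide, repeated so the sketch is self-contained."""
    file_path = f'./all_images/{fruit}.png'  # pre-existing file
    new_file_path = f'{artifact_dir}/{fruit}.png'
    shutil.copy(file_path, new_file_path)
    return f'{fruit}.png'  # return relative path


def run_demo() -> str:
    """Create ./all_images/apple.png in a scratch directory and fetch it."""
    workdir = tempfile.mkdtemp()
    previous_dir = os.getcwd()
    os.chdir(workdir)  # the function reads from ./all_images
    try:
        os.makedirs("all_images", exist_ok=True)
        write_placeholder_png(os.path.join("all_images", "apple.png"))
        artifact_dir = os.path.join(workdir, "artifacts")  # stand-in folder
        os.makedirs(artifact_dir)
        relative_path = fetch_image_from_string("apple", artifact_dir)
        # The copy must exist in the artifact folder under the returned name.
        assert os.path.exists(os.path.join(artifact_dir, relative_path))
        return relative_path
    finally:
        os.chdir(previous_dir)


if __name__ == "__main__":
    print(run_demo())
```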
3. A Builder with ~ARTIFACT_STRING~
tablevault.create_builder_file("fruit_images_index")
tablevault.create_builder_file("fetch_image_artifact")
If you don't have direct access to a text editor on your platform, you can pass the code as a string through the text argument of create_code_module().
4. Execute and Materialize Instance
Strict Checks
Various checks are performed before the table is materialized to ensure everything is configured correctly. Most importantly, each artifact_string value must have a corresponding artifact file and vice versa.
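The one-to-one requirement can be pictured as a set comparison between the values in the artifact column and the files on disk. The sketch below is not TableVault's implementation; check_artifacts is a hypothetical helper, and it assumes that artifact values are paths relative to a single artifact folder.

```python
import os
import tempfile


def check_artifacts(artifact_values, artifact_dir):
    """Compare artifact strings against files found under artifact_dir.

    Returns (missing_files, orphan_files):
      missing_files: values in the table with no file on disk
      orphan_files:  files on disk that no table value refers to
    """
    expected = set(artifact_values)
    actual = set()
    for root, _dirs, files in os.walk(artifact_dir):
        for name in files:
            full_path = os.path.join(root, name)
            actual.add(os.path.relpath(full_path, artifact_dir))
    return sorted(expected - actual), sorted(actual - expected)


if __name__ == "__main__":
    # Small demonstration with two files and one mismatched table value.
    demo_dir = tempfile.mkdtemp()
    for file_name in ("apple.png", "banana.png"):
        with open(os.path.join(demo_dir, file_name), "wb") as f:
            f.write(b"placeholder")
    missing, orphans = check_artifacts(["apple.png", "pear.png"], demo_dir)
    print("missing:", missing)
    print("orphans:", orphans)
```

Under this reading, an instance passes only when both returned lists are empty.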
5. Query for An Artifact Dataframe
You can easily retrieve the dataframe with the full or partial artifact path:
df_1 = tablevault.get_dataframe(table_name = "fruit_images", full_artifact_path = True)
df_2 = tablevault.get_dataframe(table_name = "fruit_images", full_artifact_path = False)
The dataframes should have the expected values: