Setting Up a New Loader
The TAVI framework provides a plugin-based architecture for handling different scan file formats through the Loader system. This guide explains how to create and register a new loader so it gets automatically picked up by the LoaderRegistry and integrated into the RawScanClassifier flow.
Overview
The loader system consists of three main components:
LoaderInterface: Abstract interface that all loaders must implement
LoaderRegistry: Singleton that manages and coordinates all loaders
RawScanClassifier: Uses the registry to automatically classify scan file types
The classification flow works as follows:
File Path → RawScanClassifier → LoaderRegistry → All Loaders
                                                      ↓
                                         Each loader scores the file
                                                      ↓
                                         Highest-scoring loader wins
                                                      ↓
                                            File Type Identified
Step 1: Create Your Loader Class
Create a new Python file in the src/tavi/library/storage/loader/ directory. Your loader must inherit from AbstractLoader.
Example: Creating a loader for a hypothetical “MyFormat” scan type
"""MyFormat scan file loader."""

from typing import Any

from tavi.library.data.enum.raw_scan_type import RawScanType
from tavi.library.data.scan import Scan, ScanData, ScanMetadata
from tavi.library.storage.interface.file_store_interface import FileStoreInterface
from tavi.library.storage.loader.interface.base import AbstractLoader


class MyFormatLoader(AbstractLoader):
    """Loader for MyFormat scan files."""

    def __init__(self, filestore: FileStoreInterface) -> None:
        """Initialize MyFormat loader."""
        super().__init__(filestore)

    def load(self, path: str) -> Scan:
        """Load scan data from file."""
        # Implement your file loading logic
        pass

    def get_scan_type(self) -> RawScanType:
        """Get scan type identifier."""
        # Must match the enum value you add in Step 2
        return RawScanType.MyFormat

    def get_score(self, path: str) -> float:
        """
        Score how confident this loader is for a given file.

        Return a score between 0.0 and 1.0 where:
        - 0.0: This loader cannot handle this file
        - 0.01-0.5: Maybe this loader can handle it
        - 0.51-1.0: Strongly confident this loader can handle it

        The loader with the highest score will be selected.
        """
        # Implement your file format detection logic
        if path.endswith('.myformat'):
            return 1.0  # Strong match
        return 0.0

    def parse_metadata(self, path: str) -> ScanMetadata:
        """Parse metadata from the file."""
        pass

    def parse_scan_values(self, path: str) -> ScanData:
        """Parse scan data values from the file."""
        pass

    def parse_external_metadata(self, path: str) -> dict[str, Any]:
        """Parse any external metadata associated with the file."""
        pass

    def adapt_scan_data(self, meta: ScanMetadata, values: ScanData) -> Scan:
        """Combine metadata and values into a Scan object."""
        pass
Step 2: Add Enum Value for Your Format
Add your format to the RawScanType enum in src/tavi/library/data/enum/raw_scan_type.py:
"""Enumeration of raw scan types."""

from enum import StrEnum


class RawScanType(StrEnum):
    """Enumeration of supported raw scan file types."""

    ORNLSpice = "ORNLSpice"
    MyFormat = "MyFormat"  # Add your format here
    NONE = "None"
Step 3: Register Your Loader
Register your loader in the LoaderRegistry singleton located at src/tavi/library/storage/loader/loader_registry.py:
"""Loader registry singleton."""

from neutrons_standard.decorators.singleton import Singleton

from tavi.library.storage.interface.filestore_interface import Filestore
from tavi.library.storage.loader.default_loader import DefaultLoader
from tavi.library.storage.loader.interface.base import AbstractLoader
from tavi.library.storage.loader.my_format_loader import MyFormatLoader  # Add import
from tavi.library.storage.loader.ornl_spice_loader import ORNLSpiceLoader


@Singleton
class LoaderRegistry:
    """Registry for managing loaders."""

    def __init__(self, filestore: Filestore) -> None:
        """Initialize registry with filestore."""
        self.registry: dict[str, AbstractLoader] = {}
        self.set_filestore(filestore)

        # Register loaders in order of priority (highest first)
        self._register_loader(ORNLSpiceLoader(self.filestore))
        self._register_loader(MyFormatLoader(self.filestore))  # Add your loader
        self._register_loader(DefaultLoader(self.filestore))

    # ... rest of the implementation
Important: Register more specific loaders before more general ones. The DefaultLoader should always be last as it returns a score of 0 for everything.
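The effect of registration order can be illustrated with a minimal stand-in registry. This is a sketch under assumptions, not TAVI's actual implementation: SketchRegistry, StubLoader, get_loaders(), the duplicate check, and the choice to key loaders by scan type are all hypothetical. Only the _register_loader name and the registry dict appear in the excerpt above. The sketch relies on Python dictionaries preserving insertion order, so loaders come back in the order they were registered.

```python
from dataclasses import dataclass


@dataclass
class StubLoader:
    """Hypothetical stand-in for an AbstractLoader subclass."""

    scan_type: str

    def get_scan_type(self) -> str:
        return self.scan_type


class SketchRegistry:
    """Hypothetical registry; keying by scan type is an assumption."""

    def __init__(self) -> None:
        self.registry: dict[str, StubLoader] = {}

    def _register_loader(self, loader: StubLoader) -> None:
        key = loader.get_scan_type()
        # Refuse silent overwrites so a duplicate registration is noticed early.
        if key in self.registry:
            raise ValueError(f"A loader for {key!r} is already registered")
        self.registry[key] = loader

    def get_loaders(self) -> list[StubLoader]:
        # dicts preserve insertion order, so this is registration order.
        return list(self.registry.values())


sketch = SketchRegistry()
for name in ("ORNLSpice", "MyFormat", "None"):
    sketch._register_loader(StubLoader(name))

order = [loader.get_scan_type() for loader in sketch.get_loaders()]
```

Here order lists the specific loaders first and the fallback last, mirroring the registration sequence in Step 3.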
Step 4: Test Your Loader
Create tests to verify your loader:
Scoring: Test that get_score() returns appropriate scores for your format
Loading: Test that load() correctly reads and parses files
Classification: Test that RawScanClassifier correctly identifies your file type
Example test:
from tavi.backend.classification.raw_scan_classifier import RawScanClassifier
from tavi.library.data.enum.raw_scan_type import RawScanType


def test_myformat_classification():
    classifier = RawScanClassifier()
    result = classifier.get_classification("path/to/file.myformat")
    assert result == RawScanType.MyFormat
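A scoring test can follow the same pattern. The sketch below is self-contained, so it exercises a hypothetical stand-in (StubMyFormatLoader) that mirrors the extension check from Step 1. In a real test you would instantiate MyFormatLoader with whatever filestore fixture your test suite provides, which this guide does not specify.

```python
class StubMyFormatLoader:
    """Hypothetical stand-in mirroring the Step 1 get_score() logic."""

    def get_score(self, path: str) -> float:
        if path.endswith(".myformat"):
            return 1.0
        return 0.0


def test_myformat_scoring() -> None:
    loader = StubMyFormatLoader()
    # Map each candidate path to the score the loader should report.
    cases = {
        "scan.myformat": 1.0,
        "data/run_0001.myformat": 1.0,
        "scan.dat": 0.0,
        "myformat": 0.0,  # No leading dot, so not a matching extension
    }
    for path, expected in cases.items():
        score = loader.get_score(path)
        # Every score must stay inside the documented 0.0 to 1.0 range.
        assert 0.0 <= score <= 1.0
        assert score == expected, f"{path}: got {score}"


test_myformat_scoring()
```

Listing the cases in one table keeps it easy to add the negative examples that matter most: files belonging to other formats that your loader must not claim.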
How Classification Works
When RawScanClassifier.get_classification(file_path) is called:
It retrieves all registered loaders from LoaderRegistry
It calls get_score(file_path) on each loader
It tracks which loader returned the highest score
It returns the get_scan_type() of the winning loader
If no loader scores above 0, it returns RawScanType.NONE (from DefaultLoader)
# Example with multiple loaders
classifier = RawScanClassifier()
# LoaderRegistry contains: [ORNLSpiceLoader, MyFormatLoader, DefaultLoader]
# For "scan.myformat":
# ORNLSpiceLoader.get_score("scan.myformat") → 0.0
# MyFormatLoader.get_score("scan.myformat") → 1.0
# DefaultLoader.get_score("scan.myformat") → 0.0
# Winner: MyFormatLoader
file_type = classifier.get_classification("scan.myformat")
# Result: RawScanType.MyFormat
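The selection logic described above can be sketched as a single pass over the loaders. This is an illustration, not the actual RawScanClassifier code: StubLoader, classify, the placeholder scoring lambdas, and the strict greater-than comparison (which lets the earliest-registered loader win ties) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StubLoader:
    """Hypothetical stand-in exposing the two methods the classifier needs."""

    scan_type: str
    scorer: Callable[[str], float]

    def get_scan_type(self) -> str:
        return self.scan_type

    def get_score(self, path: str) -> float:
        return self.scorer(path)


def classify(loaders: list[StubLoader], path: str) -> str:
    """Return the scan type of the highest-scoring loader, or "None"."""
    best_type = "None"
    best_score = 0.0
    for loader in loaders:
        score = loader.get_score(path)
        # Strict comparison: a later loader must beat, not match, the leader.
        if score > best_score:
            best_score = score
            best_type = loader.get_scan_type()
    return best_type


# Placeholder scoring rules, for illustration only.
loaders = [
    StubLoader("ORNLSpice", lambda p: 0.9 if p.endswith(".dat") else 0.0),
    StubLoader("MyFormat", lambda p: 1.0 if p.endswith(".myformat") else 0.0),
    StubLoader("None", lambda p: 0.0),
]

matched = classify(loaders, "scan.myformat")
unmatched = classify(loaders, "notes.txt")
```

With these stubs, matched is "MyFormat" and unmatched falls through to "None" because no loader scores above zero.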
Best Practices
Implement Robust Scoring: Your get_score() method should be fast and use multiple heuristics, returning a float between 0.0 and 1.0:
Check file extension
Validate magic bytes (file header)
Check for format-specific markers
Examine file structure
Handle Edge Cases: Return 0.0 for files you can’t handle, not negative scores
Document Your Format: Add docstrings explaining what file formats your loader supports
Fail Gracefully: If file parsing fails in load(), raise clear exceptions with context
Keep Score Ranges Consistent:
0.0 = definitely not your format
0.01-0.5 = uncertain
0.51-1.0 = confident match
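The scoring heuristics listed under Implement Robust Scoring can be combined into one function. The sketch below is only an illustration: the MAGIC signature, the .myformat extension, the 0.4/0.6 weights, and the score_file name are hypothetical, and it reads directly from the local filesystem rather than through a TAVI filestore.

```python
import tempfile
from pathlib import Path

# Hypothetical signature and extension; substitute your format's real values.
MAGIC = b"MYFMT"
EXTENSION = ".myformat"


def score_file(path: str) -> float:
    """Combine an extension check and a magic-byte check into one score."""
    file = Path(path)
    score = 0.0
    # Cheap check first: the extension alone is only weak evidence.
    if file.suffix.lower() == EXTENSION:
        score += 0.4
    # Stronger evidence: the file starts with the expected signature.
    try:
        with file.open("rb") as handle:
            header = handle.read(len(MAGIC))
    except OSError:
        # Unreadable or missing file: this loader cannot handle it.
        return 0.0
    if header == MAGIC:
        score += 0.6
    return min(score, 1.0)


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "scan.myformat"
    good.write_bytes(MAGIC + b"\x00payload")
    renamed = Path(tmp) / "scan.dat"
    renamed.write_bytes(MAGIC + b"\x00payload")
    impostor = Path(tmp) / "notes.myformat"
    impostor.write_text("plain text")

    scores = {p.name: score_file(str(p)) for p in (good, renamed, impostor)}
    missing_score = score_file(str(Path(tmp) / "absent.myformat"))
```

With these weights, a file that matches only the extension lands in the uncertain band, while a file with the right signature lands in the confident band even if it has been renamed. Reading only the first few bytes keeps the check fast.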
Example: Real-World Implementation
The ORNLSpiceLoader demonstrates a production-ready implementation:
class ORNLSpiceLoader(AbstractLoader):
    """Loader for ORNL Spice format scan files."""

    def __init__(self, filestore: FileStoreInterface) -> None:
        """Initialize ORNL Spice loader with classifier."""
        super().__init__(filestore)
        self.classifier = RuleBasedClassifier()
        self.classification_rules = ORNLSpiceRuleSet()

    def get_score(self, path: str) -> int:
        """Get score for scan using rule-based classification."""
        # Uses a dedicated RuleBasedClassifier for intelligent scoring
        return self.classifier.get_score(path, self.classification_rules)
This example shows how you can:
Use helper classifiers for complex format detection
Leverage rule sets for sophisticated scoring logic
Maintain clean separation of concerns
Troubleshooting
- My loader never gets selected
Check that get_score() returns a higher score than other loaders for your file type (between 0.0 and 1.0)
Verify your loader is registered in LoaderRegistry.__init__()
Ensure get_scan_type() returns the correct enum value
- RawScanType.NONE is returned
All loaders returned score 0.0, or no loaders were registered
Verify your loader’s get_score() implementation returns values between 0.0 and 1.0
Check that your RawScanType enum value exists
- Precommit checks fail
Add module and class docstrings to your loader file
Add docstrings to all public methods
Run ruff format to auto-fix formatting issues
See Also
RuleBasedClassifier - For implementing complex classification logic
Loader Interface:
src/tavi/library/storage/loader/interface/loader_interface.py
Existing Implementations:
src/tavi/library/storage/loader/ornl_spice_loader.py