Sap-skills sap-hana-ml
install
source · Clone the upstream repo
git clone https://github.com/secondsky/sap-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/secondsky/sap-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/sap-hana-ml/skills/sap-hana-ml" ~/.claude/skills/secondsky-sap-skills-sap-hana-ml && rm -rf "$T"
manifest: plugins/sap-hana-ml/skills/sap-hana-ml/SKILL.md
SAP HANA ML Python Client (hana-ml)
Package Version: 2.22.241011
Last Verified: 2025-11-27
Installation & Setup
pip install hana-ml
Requirements: Python 3.8+, SAP HANA 2.0 SPS03+ or SAP HANA Cloud
Quick Start
Connection & DataFrame
from hana_ml import ConnectionContext

# Connect
conn = ConnectionContext(
    address='<hostname>',
    port=443,
    user='<username>',
    password='<password>',
    encrypt=True
)

# Create DataFrame
df = conn.table('MY_TABLE', schema='MY_SCHEMA')
print(f"Shape: {df.shape}")
df.head(10).collect()
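Hardcoded credentials are fine for a quick start but rarely what you want in a shared notebook or script. A minimal sketch, assuming the connection details live in environment variables. The variable names (`HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, `HANA_PASSWORD`) and the helpers `connection_kwargs_from_env` and `connect_from_env` are hypothetical and not part of hana-ml:

```python
import os


def connection_kwargs_from_env(env=None):
    """Build ConnectionContext keyword arguments from environment variables.

    The variable names are an assumption of this sketch, not a hana-ml convention.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 matches the example above
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }


def connect_from_env():
    """Open a connection with the same arguments as the Quick Start example."""
    # Imported lazily so the helper above also works where hana-ml is not installed.
    from hana_ml import ConnectionContext

    return ConnectionContext(**connection_kwargs_from_env())
```

With the variables exported, `conn = connect_from_env()` replaces the explicit `ConnectionContext(...)` call.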
PAL Classification
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Train model
clf = UnifiedClassification(func='RandomDecisionTree')
clf.fit(train_df, features=['F1', 'F2', 'F3'], label='TARGET')

# Predict & evaluate
# (pass key='<id_column>' to predict/score if the DataFrames are not indexed)
predictions = clf.predict(test_df, features=['F1', 'F2', 'F3'])
score = clf.score(test_df, features=['F1', 'F2', 'F3'], label='TARGET')
APL AutoML
from hana_ml.algorithms.apl.classification import AutoClassifier

# Automated classification
auto_clf = AutoClassifier()
auto_clf.fit(train_df, label='TARGET')
predictions = auto_clf.predict(test_df)
Model Persistence
from hana_ml.model_storage import ModelStorage

ms = ModelStorage(conn)
clf.name = 'MY_CLASSIFIER'
ms.save_model(model=clf, if_exists='replace')
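A saved model is usually reloaded by name in a later session. The sketch below pairs the save call above with `ModelStorage.load_model`. The helper name `save_and_reload` is hypothetical, and `storage` is assumed to be a `ModelStorage` instance (any object exposing the same two methods works, which keeps the helper easy to test offline):

```python
def save_and_reload(storage, model, name):
    """Save `model` under `name`, then load it back from the same storage.

    `storage` is assumed to behave like hana_ml.model_storage.ModelStorage.
    """
    model.name = name  # models are stored and looked up by name
    storage.save_model(model=model, if_exists="replace")
    # Without an explicit version, the most recent one is expected to be returned.
    return storage.load_model(name=name)
```

For example, `restored = save_and_reload(ms, clf, 'MY_CLASSIFIER')` should give back a model object that can be used for prediction without refitting.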
Core Libraries
PAL (Predictive Analysis Library)
- 100+ algorithms executed in-database
- Categories: Classification, Regression, Clustering, Time Series, Preprocessing
- Key classes: UnifiedClassification, UnifiedRegression, KMeans, ARIMA
- See: references/PAL_ALGORITHMS.md for complete list
APL (Automated Predictive Library)
- AutoML capabilities with automatic feature engineering
- Key classes: AutoClassifier, AutoRegressor, GradientBoostingClassifier
- See: references/APL_ALGORITHMS.md for details
DataFrames
- Lazy evaluation - builds SQL until collect() is called
- In-database processing for optimal performance
- See: references/DATAFRAME_REFERENCE.md for complete API
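The lazy-evaluation idea can be illustrated without a database. The toy class below is a sketch of the concept only, not hana-ml's implementation: `LazyQuery`, `fake_execute`, and the exact SQL it generates are assumptions made for illustration. Each operation returns a new object wrapping a larger SQL string, and nothing is executed until `collect()` is called.

```python
class LazyQuery:
    """Toy stand-in for a lazily evaluated DataFrame (not hana-ml's code)."""

    def __init__(self, select_statement):
        self.select_statement = select_statement

    def filter(self, condition):
        # Wrap the current statement in a subquery; nothing runs yet.
        return LazyQuery(
            f"SELECT * FROM ({self.select_statement}) AS T WHERE {condition}"
        )

    def head(self, n):
        # Limit the row count, again only by rewriting the SQL text.
        return LazyQuery(
            f"SELECT TOP {int(n)} * FROM ({self.select_statement}) AS T"
        )

    def collect(self, execute):
        # Only here does the accumulated SQL reach the "database".
        return execute(self.select_statement)


executed = []  # records every statement that was actually sent


def fake_execute(sql):
    executed.append(sql)
    return []  # a real executor would return the fetched rows


query = LazyQuery('SELECT * FROM "MY_SCHEMA"."MY_TABLE"').filter('"F1" > 0').head(10)
assert executed == []  # building the query executed nothing
rows = query.collect(fake_execute)
assert len(executed) == 1  # one statement, sent at collect() time
```

hana-ml DataFrames expose the SQL built so far through their `select_statement` attribute, which is handy for checking what a chain of operations will run before calling `collect()`.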
Visualizers
- EDA plots, model explanations, metrics
- SHAP integration for model interpretability
- See: references/VISUALIZERS.md for 14 visualization modules
Common Patterns
Train-Test Split
from hana_ml.algorithms.pal.partition import train_test_val_split

train, test, val = train_test_val_split(
    data=df,
    training_percentage=0.7,
    testing_percentage=0.2,
    validation_percentage=0.1
)
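The three percentages are meant to add up to 1. A small sketch that derives the validation share from the other two, so the values cannot drift apart when one is changed; `split_percentages` is a hypothetical helper, not part of hana-ml:

```python
def split_percentages(training, testing):
    """Return keyword arguments for a three-way split.

    The validation share is whatever remains after training and testing.
    """
    if not (0 < training < 1 and 0 <= testing < 1):
        raise ValueError("percentages must be fractions between 0 and 1")
    # Rounding hides floating-point noise such as 0.10000000000000003.
    validation = round(1.0 - training - testing, 10)
    if validation < 0:
        raise ValueError("training + testing must not exceed 1")
    return {
        "training_percentage": training,
        "testing_percentage": testing,
        "validation_percentage": validation,
    }
```

The result can be passed straight through: `train_test_val_split(data=df, **split_percentages(0.7, 0.2))`.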
Feature Importance
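# APL models
importance = auto_clf.get_feature_importances()

# PAL models (features is a list of column names, e.g. ['F1', 'F2', 'F3'])
from hana_ml.algorithms.pal.preprocessing import FeatureSelection
fs = FeatureSelection()  # some hana-ml versions require an fs_method argument here
fs.fit(train_df, features=features, label='TARGET')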
Pipeline
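from hana_ml.algorithms.pal.pipeline import Pipeline
from hana_ml.algorithms.pal.preprocessing import Imputer, FeatureNormalizer
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Each step is a (name, estimator) pair; steps run in order.
# Check the Imputer docs for the strategy names your hana-ml version accepts.
pipeline = Pipeline([
    ('imputer', Imputer(strategy='mean')),
    ('normalizer', FeatureNormalizer()),
    ('classifier', UnifiedClassification(func='RandomDecisionTree'))
])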
Best Practices
- Use lazy evaluation - Operations build SQL without execution until collect()
- Leverage in-database processing - Keep data in HANA for performance
- Use Unified interfaces - Consistent APIs across algorithms
- Save models - Use ModelStorage for persistence
- Explain predictions - Use SHAP explainers for interpretability
- Monitor AutoML - Use PipelineProgressStatusMonitor for long-running jobs
Bundled Resources
Reference Files
- references/DATAFRAME_REFERENCE.md (479 lines)
  - ConnectionContext API, DataFrame operations, SQL generation
- references/PAL_ALGORITHMS.md (869 lines)
  - Complete PAL algorithm reference (100+ algorithms)
  - Classification, Regression, Clustering, Time Series, Preprocessing
- references/APL_ALGORITHMS.md (534 lines)
  - AutoML capabilities, automated feature engineering
  - AutoClassifier, AutoRegressor, GradientBoosting classes
- references/VISUALIZERS.md (704 lines)
  - 14 visualization modules (EDA, SHAP, metrics, time series)
  - Plot types, configuration, export options
- references/SUPPORTING_MODULES.md (626 lines)
  - Model storage, spatial analytics, graph algorithms
  - Text mining, statistics, error handling
Error Handling
from hana_ml.ml_exceptions import Error

try:
    clf.fit(train_df, features=features, label='TARGET')
except Error as e:
    print(f"HANA ML Error: {e}")
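Printing the error and carrying on can hide a failed fit from the code that follows. One possible variation, sketched under the assumption that the caller wants a clear success-or-failure signal: the helper logs the error and returns `None` instead of a fitted model. `fit_or_none` and the logger name are hypothetical, and the exception type is passed in so the helper does not depend on hana-ml being importable.

```python
import logging

logger = logging.getLogger("hana_ml_example")  # hypothetical logger name


def fit_or_none(model, data, error_type=Exception, **fit_kwargs):
    """Call model.fit and return the model, or None if `error_type` is raised.

    Pass error_type=hana_ml.ml_exceptions.Error to catch only hana-ml errors;
    anything else propagates to the caller unchanged.
    """
    try:
        model.fit(data, **fit_kwargs)
    except error_type as exc:
        logger.error("HANA ML Error: %s", exc)
        return None
    return model
```

A call such as `fit_or_none(clf, train_df, error_type=Error, features=features, label='TARGET')` then lets the caller test the result with `is None` before predicting.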
Documentation
- Official Docs: https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html
- PyPI Package: https://pypi.org/project/hana-ml/