Dataiku has unveiled the latest update to its data science and machine learning platform, Dataiku 11.1. The update includes enhancements to existing capabilities as well as new features for data scientists, ML engineers, and analysts.
Dataiku 11 introduced a guided task within its VisualML framework to simplify developing and deploying time series forecasting models. The 11.1 update now enables users to optimize hyperparameters for their forecasting models. This optimization uses a k-fold cross validation technique that respects time ordering and ensures validation folds are both consecutive to training sets and non-overlapping, according to Dataiku.
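The time-ordered splitting described above can be illustrated with a minimal Python sketch. It is not Dataiku's implementation: the function name `time_ordered_folds`, the expanding training window, and the equal fold sizes are all assumptions made for illustration. It only shows the two stated properties, that each validation fold immediately follows its training set and that validation folds never overlap.

```python
from typing import Iterator, List, Tuple


def time_ordered_folds(
    n_samples: int, n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for time-sorted rows.

    Hypothetical helper: assumes rows are already sorted by timestamp and
    uses an expanding training window, which is one common choice.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last validation fold absorbs any leftover rows.
        val_end = n_samples if k == n_folds else train_end + fold_size
        # Validation starts exactly where training ends, so no future
        # observation leaks into the training set.
        yield list(range(train_end)), list(range(train_end, val_end))


folds = list(time_ordered_folds(n_samples=10, n_folds=4))
for train_idx, val_idx in folds:
    print(f"train={train_idx} validate={val_idx}")
```

Each candidate hyperparameter setting would be scored on every validation fold and the scores averaged, so the chosen setting is always judged on data that comes after what it was trained on.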
When k-fold cross validation is activated for binary or multi-class classification tasks, a new stratified option splits the samples in the same proportion as they appear in the whole population and can be used to eliminate sampling bias during cross-test validations. The company says this approach allows users to more accurately model situations seen at prediction time, or when users are modeling on past data to make predictions with future-oriented data. There are also new capabilities for generating model comparisons for time series models, which let data scientists compare and contrast models on measurements like performance metrics, time series resampling settings, features handling, and training details.
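To make the stratified option concrete, here is a small Python sketch of stratified fold assignment. It is an illustration only, not Dataiku's code: the helper `stratified_folds` and the round-robin dealing strategy are assumptions, and a production splitter would typically also shuffle within each class.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], n_folds: int) -> List[List[int]]:
    """Assign sample indices to folds so each fold mirrors the class mix.

    Hypothetical helper: deals each class's samples round-robin across folds.
    """
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Spreading one class at a time keeps its share per fold close to
        # its share in the whole population.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Toy population: two thirds negative, one third positive.
labels = ["neg"] * 8 + ["pos"] * 4
folds = stratified_folds(labels, n_folds=4)
for fold in folds:
    positives = sum(1 for i in fold if labels[i] == "pos")
    print(f"size={len(fold)} positives={positives}")
```

With a plain random split, a small fold could by chance contain no positive samples at all. Stratifying removes that source of variation between folds.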
Explaining model behavior or troubleshooting unexpected or incorrect predictions is valuable for clarifying model predictions to stakeholders. The Dataiku platform supports explainability through its VisualML interpretability functionalities, and for computer vision users, this has now been enhanced for image classification and object detection modeling in 11.1. The “What If?” tab now contains a visual heat map representation that highlights which areas had the most influence on the model’s prediction. When hovering over images for each predicted class, the heat map is overlaid on the scored image so users can see exactly which pixels the model focused on for its prediction.
The platform’s explainability features are also now available for externally sourced models brought into Dataiku through the MLflow integration: “Data scientists can now compute partial dependence to see how the model is influenced by values across each variable, subpopulation analysis to track any potential bias on subsets of data, and individual explanations to deep dive on extreme probabilities,” Dataiku said in a company blog post.
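Partial dependence, mentioned in the quote above, can be sketched in a few lines of Python. This is a generic illustration of the technique, not Dataiku's implementation: `partial_dependence`, `toy_model`, and the sample rows are all hypothetical. The idea is to force one feature to a fixed value for every row, average the model's predictions, and repeat over a grid of values.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: str,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when `feature` is forced to each grid value."""
    averages = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)  # copy so the input data is untouched
            modified[feature] = value
            total += predict(modified)
        averages.append(total / len(rows))
    return averages


def toy_model(row: Row) -> float:
    """Stand-in for any trained model's prediction function."""
    return 2.0 * row["age"] + row["income"]


rows = [{"age": 1.0, "income": 10.0}, {"age": 3.0, "income": 20.0}]
curve = partial_dependence(toy_model, rows, "age", grid=[0.0, 1.0, 2.0])
print(curve)
```

Because the function needs only a prediction callable, the same approach works for a model trained elsewhere, which is what makes it applicable to imported models.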
For those who have been using Dataiku’s MLflow integration to import models, the reverse is now possible. Models developed in Dataiku 11.1 can now be exported in the open source MLflow format for ML engineers who want to deploy models outside of Dataiku. Users can also export Dataiku models directly to Python code for use in any Python program outside of Dataiku.
Dataiku 11.1 also has two new chart types for data visualization. There is now a treemap chart for visualizing relationships and ratios between elements in categorical and hierarchical data. A second addition is a KPI chart that displays individual aggregated features as single numbers with conditional formatting to gauge KPI progress.
Other platform enhancements include support for more data connection types and table descriptions, enhanced data exploration, cleansing, and export, and new coding capabilities. Visit the release notes or a blog from Dataiku’s Christina Hsiao to read more about 11.1.